Analysis RPG ASEAN

Author

Anna Biller, Johannes Zauner, Christian Cajochen, Marisa A Gerle, Vineetha Kalavally, Anas Mohamed, Lukas Rottländer, Ming-Yi Seah, Oliver Stefani, Manuel Spitschan

1 Preface

This is the analysis supplement to the publication “Physiologically-relevant light exposure and light behaviour in Switzerland and Malaysia”, as per the OSF Preregistration from 18 October 2024, version v1.0.1.

Note: Rendering this analysis document from its original script will fail unless the recorded renv libraries are restored.

In short, light exposure was measured in Basel, Switzerland, and Kuala Lumpur, Malaysia, for one month in 20 individuals per site. Additionally, questionnaires on sleep (PSQI) and light exposure behaviour (LEBA) were collected at specified times. The following hypotheses and research questions were formulated.

1.1 Research questions

  • RQ 1: Are there differences in objectively measured light exposure between the two sites, and if so, in which light metrics?

  • RQ 2: Are there differences in self-reported light exposure patterns using LEBA across time or between the two sites, and if so, in which questions/scores?

  • RQ 3: In general, how are light exposure and LEBA related, and are there differences in this relationship between the two sites?

1.2 Hypotheses

For RQ 1, the following hypotheses will be addressed:

  • \(H1\): There are differences in light logger-derived light exposure intensity levels and duration of intensity between Malaysia and Switzerland.

  • \(H0_1\) : No differences between Malaysia and Switzerland.

  • \(H2\): There are differences in light logger-derived timing of light exposure between Malaysia and Switzerland.

  • \(H0_2\): No differences between Malaysia and Switzerland.

For RQ 2, the following hypotheses will be addressed:

  • \(H3\): There are differences in LEBA items and factors between Malaysia and Switzerland.

  • \(H0_3\): No differences between Malaysia and Switzerland.

  • \(H4\): LEBA scores vary over time within participants.

  • \(H0_4\): No variation over time within participants.

For RQ 3, the following hypotheses will be addressed:

  • \(H5\): LEBA items correlate with preselected light-logger derived light exposure variables.

  • \(H0_5\): No correlation.

  • \(H6\): There is a difference between Malaysia and Switzerland on how well light-logger derived light exposure variables correlate with subjective LEBA items.

  • \(H0_6\): No differences between Malaysia and Switzerland.

2 Summary

This analysis shows several key differences and similarities between the two sites, Malaysia and Switzerland.

  • \(H1\): Swiss participants had significantly more time in daylight levels (above 1000 lx mel EDI) compared to Malaysian participants, and overall also stayed longer in healthy light levels during the day (above 250 lx mel EDI). While the photoperiod was longer in Switzerland than in Malaysia, this difference is still significant when adjusting for photoperiod, where the (unstandardized) effect size is almost a factor of two.

  • \(H2\): The brightest time of day was significantly brighter for Swiss participants compared to Malaysian participants, with a difference of about 0.5 log 10 units. The last time of day with light exposure was also significantly later in Switzerland compared to Malaysia, by about 1.5 hours, which can at least partly be attributed to the longer photoperiod. Swiss participants do, however, also avoid light exposure above 10 lx mel EDI significantly earlier than Malaysian participants, by about 1 hour and 10 minutes. Additionally, Swiss participants average about 1 log 10 unit mel EDI lower during evenings, compared to Malaysia. Finally, Swiss participants are about twice as often in a period of light above 250 lx mel EDI during the day.

  • \(H3\): LEBA questions and factors do not show a significant difference between Malaysia and Switzerland.

  • \(H4\): Scores for most of the LEBA items are very stable over time. For all 23 questions and for 1 of the 5 factors, more than 50% of participants show no significant variation over time.

  • \(H5\): LEBA items correlate with preselected light-logger derived light exposure variables in the Switzerland site, but not Malaysia. While exploratory analysis shows correlations in both sites, only Switzerland shows significant correlations after correction for multiple testing. Out of 84 correlations, 10 are significant in Switzerland. The effect size of those correlations is medium on average (r = 0.39).

  • \(H6\): There are few LEBA questions where the correlation coefficient differs significantly between sites. Specifically, these are the questions “I dim my mobile phone screen within 1 hour before attempting to fall asleep” & “I dim my computer screen within 1 hour before attempting to fall asleep”. While the correlation with the preselected light exposure metrics is positive in Malaysia (r = 0.25 for both), it is zero or negative in Switzerland (r = -0.01 and -0.15, respectively).

3 Setup

Data analysis is performed in R statistical software, mainly using the LightLogR R package, which is developed as part of the MeLiDos project. The document is rendered to HTML via Quarto.

The following packages are used for the analysis:

#collect all the packages that are needed for the analysis
packages <- c("quarto", "tidyverse", "cowplot", "gt", "gtsummary", 
              "patchwork", "readxl", "ggsci", "ggcorrplot", "LightLogR",
              "rlang", "suntools", "lme4", "lmerTest", "broom.mixed",
              "lattice", "magrittr", "mgcv", "itsadug",
              "labelled", "ordinal", "ggtext", "magick", "ggh4x",
              "rnaturalearthdata")

#check if packages are installed, if not install them
if(!require(pacman)) {
  install.packages("pacman") 
  library(pacman)
}

p_load(packages, character.only = TRUE)

#source functions
source("functions/missing_data_table.R")
source("functions/data_sufficient.R")
source("functions/photoperiod.R")
source("functions/table_lmer.R")
source("functions/Inf_plots.R")
source("functions/inference.R")
source("functions/inference_table.R")

#setting a seed for the date of generation to ensure reproducibility of random processes
seed <- 20241105

#if OpenMP is supported by the executing machine, analysis times for Research Question 2, Hypothesis 2, Patterns, can be sped up dramatically. If not supported, however, it distorts results dramatically and should not be used.
OpenMP_support <- TRUE

3.1 Global parameters

Here we set global parameters for the analysis. Except for the seed, these are specified in the preregistration document.

3.1.1 Coordinates and timezones of the sites

Coordinates are specified as latitude and longitude in decimal degrees. Timezones are chosen from the OlsonNames() function.

coordinates <- list(
#Kuala Lumpur Malaysia, Monash University
malaysia = c(101.6009, 3.0650),
#Basel Switzerland, University of Basel
switzerland = c(7.5839, 47.5585)
)

tzs <- list(
  malaysia = "Asia/Kuala_Lumpur",
  switzerland = "Europe/Zurich")
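As a quick sanity check, the configured timezone strings can be validated against the Olson database, and a single instant can be rendered as local clock time at both sites. This is a minimal base-R sketch (the example instant is arbitrary):

```r
# Verify the configured timezones are valid Olson names (base R sketch)
tzs <- list(malaysia = "Asia/Kuala_Lumpur", switzerland = "Europe/Zurich")
stopifnot(all(unlist(tzs) %in% OlsonNames()))

# The same instant rendered as local clock time at each site
instant <- as.POSIXct("2023-12-01 12:00:00", tz = "UTC")
format(instant, "%H:%M", tz = tzs$malaysia)    # "20:00" (UTC+8)
format(instant, "%H:%M", tz = tzs$switzerland) # "13:00" (CET, UTC+1)
```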

3.1.2 Illuminance threshold

The upper measurement threshold is set to 130,000 lx. While a lower threshold of 1 lx is specified in the preregistration, it will not be applied in the analysis, as variation in light levels below 1 lx is not relevant for the analysis parameters.

#set a maximum illuminance threshold in lx that is acceptable for the sensor values
illuminance_upper_th <- 1.3*10^5

3.1.3 Random metrics for the sensitivity analysis

The preregistration document specifies the procedure to define a threshold of missing data through a sensitivity analysis, based on three random metrics, which are defined here. The metrics are taken from Table 5 in the preregistration document, with the exception of Interdaily Stability (IS), which, unlike the others, cannot be calculated on a daily basis.

#choosing three random metrics. This was the original script used to determine the metrics
# metrics <- c("TAT 250", "TAT 1000", "Period above 1000 lx", "M10m", "L5m", "IV", "LLiT 10", "LLit 250", "Frequency crossing threshold", "FLiT 1000", "LE", "M/P ratio")
# 
# metrics_sample <- 
#   sample(metrics, 3)
metrics_sample <- c("Llit 10","M/P ratio", "IV")
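A seeded version of the commented-out draw would make the selection reproducible. Note this is a sketch only: the recorded draw above was made without a documented seed, so running this will not recreate it, and the seed value reuses the document seed as an assumption:

```r
# Reproducible version of the metric draw (sketch; will not recreate the
# original, undocumented draw above)
metrics <- c("TAT 250", "TAT 1000", "Period above 1000 lx", "M10m", "L5m",
             "IV", "LLiT 10", "LLiT 250", "Frequency crossing threshold",
             "FLiT 1000", "LE", "M/P ratio")
set.seed(20241105)                # document seed, assumed here
metrics_sample <- sample(metrics, 3)
length(metrics_sample)            # always 3, regardless of seed
```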

4 Import

4.1 Loading data in

4.1.1 Import light exposure data (Malaysia)

The data sit in a folder structure organized by country and participant ID. From the Malaysia/ subfolders, only .csv files that do not contain qualtrics in their file name are imported. The data are imported using the Speccy import function of the LightLogR package, as this is the device used. The function automatically detects the participant ID from the file name. The timezone is set for Malaysia.

Note: Participant MY006 is missing from the data, as this participant lost the measurement device.
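The "no qualtrics" rule can be written as a single regular expression with a negative lookahead, but lookaheads require Perl-compatible regex, which list.files() does not expose; this is presumably why the script below filters in a second step. A minimal sketch with hypothetical file names:

```r
# Negative lookahead needs perl = TRUE; list.files() cannot pass it, so
# filtering after listing (as in the import script) is the practical route
files <- c("MY001/lightlog.csv", "MY001/MY001_qualtrics.csv", "MY001/notes.txt")
keep  <- grepl("^(?!.*qualtrics).*\\.csv$", files, perl = TRUE)
files[keep]  # "MY001/lightlog.csv"
```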

#get all files in the Input/Malaysia folder that is inside a subfolder "MY001" to "MY020" and that does not contain "qualtrics" in the file name
base_folder <- "data/Malaysia"
ids <- sprintf("MY%03d", c(1:5, 7:20))
all_folders <- file.path(base_folder, ids)
#list.files() cannot use a perl-style lookahead pattern, so qualtrics files
#are removed in a second step after listing all .csv files
csv_files <- 
  list.files(
    path = all_folders, pattern = "\\.csv$", full.names = TRUE, recursive = TRUE
    )
csv_files <- csv_files[!str_detect(csv_files, "qualtrics")]
id.pattern <- "MY[0-9]{3}"

#Import the data
data_malaysia <- 
  import$Speccy(csv_files, tz = tzs[["malaysia"]], auto.id = id.pattern)

Successfully read in 733'908 observations across 19 Ids from 40 Speccy-file(s).
Timezone set is Asia/Kuala_Lumpur.
The system timezone is Europe/Berlin. Please correct if necessary!

First Observation: 2023-11-20 10:13:58
Last Observation: 2023-12-30 00:30:00
Timespan: 40 days

Observation intervals: 
   Id    interval.time              n pct  
 1 MY001 60s (~1 minutes)       43330 100% 
 2 MY001 65s (~1.08 minutes)        1 0%   
 3 MY002 59s                        1 0%   
 4 MY002 60s (~1 minutes)       22189 100% 
 5 MY002 61s (~1.02 minutes)        1 0%   
 6 MY002 9978s (~2.77 hours)        1 0%   
 7 MY002 17965s (~4.99 hours)       1 0%   
 8 MY002 32323s (~8.98 hours)       1 0%   
 9 MY002 1202366s (~1.99 weeks)     1 0%   
10 MY003 60s (~1 minutes)       43195 100% 
# ℹ 35 more rows

#renaming the wavelength columns
data_malaysia <- 
  data_malaysia %>% 
  rename_with(\(x) {
    paste0("WL_", x) %>% str_replace("[.][.][.]", "")
  }, 
  `...380`:`...780`
  )

4.1.2 Missing data (Malaysia)

The summary shows that all data were collected at approximately the same time, and about half of the recordings are without gaps. The next section makes implicit gaps explicit by creating a regular time series from the first to the last day of measurement. Minutes without an observation are filled with NA.
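Making implicit gaps explicit can be sketched in base R by merging the observations onto a regular one-minute grid; LightLogR's gap_handler() does the equivalent per participant. The values below are illustrative:

```r
# Observations with a two-minute hole between 00:01 and 00:04
obs <- data.frame(
  Datetime = as.POSIXct(c("2023-11-20 00:00", "2023-11-20 00:01",
                          "2023-11-20 00:04"), tz = "Asia/Kuala_Lumpur"),
  MEDI = c(0.1, 0.2, 0.3)
)
# Regular 1-minute grid from first to last observation
grid <- data.frame(Datetime = seq(min(obs$Datetime), max(obs$Datetime),
                                  by = "1 min"))
filled <- merge(grid, obs, all.x = TRUE)  # missing minutes become explicit NA
sum(is.na(filled$MEDI))  # 2 explicit gaps
```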

#check for gaps (implicit missing data)
data_malaysia %>% gap_finder()
Found 177849 gaps. 597853 Datetimes fall into the regular sequence.
#confirm that the regular epoch is 1 minute
data_malaysia %>% dominant_epoch()
# A tibble: 19 × 3
   Id    dominant.epoch   group.indices
   <fct> <Duration>               <int>
 1 MY001 60s (~1 minutes)             1
 2 MY002 60s (~1 minutes)             2
 3 MY003 60s (~1 minutes)             3
 4 MY004 60s (~1 minutes)             4
 5 MY005 60s (~1 minutes)             5
 6 MY007 60s (~1 minutes)             6
 7 MY008 60s (~1 minutes)             7
 8 MY009 60s (~1 minutes)             8
 9 MY010 60s (~1 minutes)             9
10 MY011 60s (~1 minutes)            10
11 MY012 60s (~1 minutes)            11
12 MY013 60s (~1 minutes)            12
13 MY014 60s (~1 minutes)            13
14 MY015 60s (~1 minutes)            14
15 MY016 60s (~1 minutes)            15
16 MY017 60s (~1 minutes)            16
17 MY018 60s (~1 minutes)            17
18 MY019 60s (~1 minutes)            18
19 MY020 60s (~1 minutes)            19
#bring data into a regular time series of 1 minute
data_malaysia_temp <- 
  data_malaysia %>% mutate(Datetime = round_date(Datetime, "1 minute"))
#check that no Datetime is present twice after rounding
stopifnot("At least one datetime is present twice after rounding" = 
            data_malaysia_temp %>% distinct(Id, Datetime) %>% nrow() %>% {. == nrow(data_malaysia)}
          )
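Rounding to the minute can collapse two timestamps onto the same minute, which is what the duplicate check above guards against. A base-R sketch with illustrative timestamps:

```r
# Two raw timestamps that both round to 10:14:00
dt <- as.POSIXct(c("2023-11-20 10:13:58", "2023-11-20 10:14:20"), tz = "UTC")
rounded <- as.POSIXct(round(as.numeric(dt) / 60) * 60,
                      origin = "1970-01-01", tz = "UTC")
anyDuplicated(rounded)  # 2: the second timestamp duplicates the first
```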

#show a summary of implicitly missing data
implicitly_missing_summary(data_malaysia_temp, 
                           "Implicitly missing data in the Malaysia dataset", 60)
Implicitly missing data in the Malaysia dataset
                                              min      max   median     mean     total
Number of gaps                                  0        7        0        1        21
Duration missed by available data points   0 days  14 days   0 days    1 day   29 days
Duration covered by available data points 11 days  30 days  30 days  26 days  509 days
Number of missing data points                   0   21,040        0    2,200    41,796
Number of available data points            16,604   43,996   43,212   38,627   733,908
Percentage of missing data points            0.0%    48.7%     0.0%     5.4%      5.4%
min, max, median, mean, and total values for 19 participants
#make implicit gaps explicit
data_malaysia <- data_malaysia_temp %>% gap_handler(full.days = TRUE)
#check for gaps
data_malaysia %>% gap_finder()
No gaps found
#remove the temporary dataframe
rm(data_malaysia_temp)

#show values above the photopic illuminance thresholds
data_malaysia %>% 
  filter(Photopic.lux > illuminance_upper_th) %>% 
  select(Id, Datetime, Photopic.lux, MEDI) %>% 
  gt(caption = 
       paste("Values above", illuminance_upper_th, "lx photopic illuminance"))
Values above 130000 lx photopic illuminance
Datetime Photopic.lux MEDI
MY014
2023-12-22 14:26:00 133096.7 132775.7
#apply the filter for the upper illuminance threshold
data_malaysia <-
data_malaysia %>% 
  filter(Photopic.lux <= illuminance_upper_th | is.na(Photopic.lux)) %>% 
  gap_handler()

#set illuminance values to 0.1 if they are below 1 lux (as the sensor does not measure below 1 lux, but 0.1 lx can be logarithmically transformed)
data_malaysia <- 
  data_malaysia %>% 
  mutate(MEDI = ifelse(Photopic.lux < 1, 0.1, MEDI))
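The floor of 0.1 lx exists because a value of zero cannot be log-transformed, while 0.1 lx yields a finite value one log unit below the sensor's 1 lx limit. A minimal base-R sketch of the recoding, with illustrative values:

```r
# Why a 0.1 lx floor: zero illuminance breaks the log transform
log10(0)    # -Inf
log10(0.1)  # -1, a finite stand-in for "below sensor range"

# Sketch of the recoding applied above
medi     <- c(0, 0.5, 3, 250)
photopic <- c(0, 0.5, 3, 250)
medi[photopic < 1] <- 0.1
medi  # 0.1 0.1 3.0 250.0
```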

#show a summary of data missing in general
data_malaysia %>% filter(!is.na(MEDI)) %>% 
  implicitly_missing_summary( 
                           "Missing data in the Malaysia dataset overall", 
                           60)
Missing data in the Malaysia dataset overall
                                              min      max   median     mean     total
Number of gaps                                  0        7        0        1        22
Duration missed by available data points   0 days  14 days   0 days    1 day   29 days
Duration covered by available data points 11 days  30 days  30 days  26 days  509 days
Number of missing data points                   0   21,040        0    2,200    41,797
Number of available data points            16,604   43,996   43,211   38,627   733,907
Percentage of missing data points            0.0%    48.7%     0.0%     5.4%      5.4%
min, max, median, mean, and total values for 19 participants

4.1.3 Import LEBA data (Malaysia)

In this section, the LEBA data are imported. The data are stored in a MYXXX_qualtrics.csv file within each participant’s folder and contain questionnaire data. Days are coded 1 to X (X being the last day), so the day indices need to be connected with the dates from the light data.

#Import the data
leba_folders <- file.path(base_folder, ids)
pattern <- "qualtrics\\.csv$"
leba_files <- 
  list.files(path = leba_folders, pattern = pattern, full.names = TRUE)

leba_malaysia <- 
  read_csv(leba_files, id = "file.path") %>% 
  mutate(Id = str_extract(file.path, id.pattern) %>% as.factor(),
         Day = parse_number(`...1`)
         ) %>% 
  select(Id, Day, starts_with("leba"))
New names:
Rows: 209 Columns: 51
── Column specification
──────────────────────────────────────────────────────── Delimiter: "," chr
(50): ...1, leba_F1_01, leba_F1_02, leba_F1_03, leba_F2_04, leba_F2_05, ...
ℹ Use `spec()` to retrieve the full column specification for this data. ℹ
Specify the column types or set `show_col_types = FALSE` to quiet this message.
• `` -> `...1`
Warning: There was 1 warning in `mutate()`.
ℹ In argument: `Day = parse_number(...1)`.
Caused by warning:
! 19 parsing failures.
row col expected   actual
  1  -- a number Question
 12  -- a number Question
 23  -- a number Question
 34  -- a number Question
 45  -- a number Question
... ... ........ ........
See problems(...) for more details.
#extract the questions from the data
leba_questions <- leba_malaysia[1,-c(1,2)] %>% unlist()

#set the factor levels for leba
leba_levels <- c("Never", "Rarely", "Sometimes", "Often", "Always")

#drop the first row and set the factor levels
leba_malaysia <- 
  leba_malaysia %>%
  drop_na(Day) %>% 
  mutate(across(starts_with("leba"), ~ factor(.x, levels = leba_levels)))

#drop rows with NA
leba_malaysia <- leba_malaysia %>% drop_na()

#set variable labels
var_label(leba_malaysia) <- leba_questions %>% as.list()
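The five LEBA response options form a Likert scale. The script above builds plain factors with explicit levels; encoding the same responses as an ordered factor (a sketch, not the analysis pipeline itself) makes rank comparisons and integer scoring explicit:

```r
# LEBA responses as an ordered factor (base R sketch)
leba_levels <- c("Never", "Rarely", "Sometimes", "Often", "Always")
responses   <- c("Often", "Never", "Sometimes")
f <- factor(responses, levels = leba_levels, ordered = TRUE)
as.integer(f)   # 4 1 3
f > "Rarely"    # TRUE FALSE TRUE
```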

4.1.4 Import light exposure data (Switzerland)

The Swiss data are structured by participant folders, with one or more .csv files per participant.

#get all files in the data/Basel/Speccy folder that is inside a subfolder "ID01" to "ID20" and in those a .csv file
base_folder <- "data/Basel/Speccy"
subfolders <- sprintf("ID%02d", 1:20)
pattern <- "\\.csv$"
all_folders <- file.path(base_folder, subfolders)
csv_files <- list.files(path = all_folders, pattern = pattern, full.names = TRUE, recursive = TRUE)
id.pattern <- "ID\\d{2}"

#Import the data
data_switzerland <- import$Speccy(csv_files, tz = tzs[["switzerland"]], auto.id = id.pattern)

Successfully read in 799'088 observations across 20 Ids from 49 Speccy-file(s).
Timezone set is Europe/Zurich.
The system timezone is Europe/Berlin. Please correct if necessary!

First Observation: 2023-07-17 17:30:00
Last Observation: 2023-10-18 14:10:36
Timespan: 93 days

Observation intervals: 
   Id    interval.time              n pct  
 1 ID01  60s (~1 minutes)       35968 100% 
 2 ID01  36180s (~10.05 hours)      1 0%   
 3 ID01  67260s (~18.68 hours)      1 0%   
 4 ID01  302520s (~3.5 days)        1 0%   
 5 ID02  60s (~1 minutes)       43530 100% 
 6 ID02  1538s (~25.63 minutes)     1 0%   
 7 ID03  60s (~1 minutes)       43873 100% 
 8 ID03  1488s (~24.8 minutes)      1 0%   
 9 ID03  43282s (~12.02 hours)      1 0%   
10 ID04  60s (~1 minutes)       41554 100% 
# ℹ 39 more rows

#renaming the wavelength columns
data_switzerland <- 
  data_switzerland %>% 
  rename_with(\(x) {
    paste0("WL_", x) %>% str_replace("[.][.][.]", "")
  }, 
  `...380`:`...780`
  )

4.1.5 Missing data (Switzerland)

The overview suggests there is implicit missing data, which will be handled as follows:

#check for gaps (implicit missing data)
data_switzerland %>% gap_finder()
Found 468822 gaps. 370932 Datetimes fall into the regular sequence.
#confirm that the regular epoch is 1 minute
data_switzerland %>% dominant_epoch()
# A tibble: 20 × 3
   Id    dominant.epoch   group.indices
   <fct> <Duration>               <int>
 1 ID01  60s (~1 minutes)             1
 2 ID02  60s (~1 minutes)             2
 3 ID03  60s (~1 minutes)             3
 4 ID04  60s (~1 minutes)             4
 5 ID05  60s (~1 minutes)             5
 6 ID06  60s (~1 minutes)             6
 7 ID07  60s (~1 minutes)             7
 8 ID08  60s (~1 minutes)             8
 9 ID09  60s (~1 minutes)             9
10 ID10  60s (~1 minutes)            10
11 ID11  60s (~1 minutes)            11
12 ID12  60s (~1 minutes)            12
13 ID13  60s (~1 minutes)            13
14 ID14  60s (~1 minutes)            14
15 ID15  60s (~1 minutes)            15
16 ID16  60s (~1 minutes)            16
17 ID17  60s (~1 minutes)            17
18 ID18  60s (~1 minutes)            18
19 ID19  60s (~1 minutes)            19
20 ID20  60s (~1 minutes)            20
#bring data into a regular time series of 1 minute
data_switzerland_temp <- 
  data_switzerland %>% mutate(Datetime = round_date(Datetime, "1 minute"))
#check that no Datetime is present twice after rounding
stopifnot("At least one datetime is present twice after rounding" = 
            data_switzerland_temp %>% distinct(Id, Datetime) %>% nrow() %>% {. == nrow(data_switzerland)}
          )

#show a summary of implicitly missing data
implicitly_missing_summary(data_switzerland_temp, 
                           "Implicitly missing data in the Swiss dataset", 60)
Implicitly missing data in the Swiss dataset
                                              min      max   median     mean     total
Number of gaps                                  1        4        1        1        29
Duration missed by available data points   <1 day  11 days   <1 day    1 day   28 days
Duration covered by available data points  7 days  30 days  29 days  27 days  554 days
Number of missing data points                  20   16,017       40    2,034    40,680
Number of available data points            11,476   44,588   42,974   39,954   799,088
Percentage of missing data points            0.0%    40.4%     0.1%     5.9%      4.8%
min, max, median, mean, and total values for 20 participants
#make implicit gaps explicit
data_switzerland <- data_switzerland_temp %>% gap_handler(full.days = TRUE)
#check for gaps
data_switzerland %>% gap_finder()
No gaps found
#remove the temporary dataframe
rm(data_switzerland_temp)

#show values above the photopic illuminance thresholds
data_switzerland %>% 
  filter(Photopic.lux > illuminance_upper_th) %>% 
  select(Id, Datetime, Photopic.lux, MEDI) %>% 
  gt(caption = 
       paste("Values above", illuminance_upper_th, "lx photopic illuminance"))
Values above 130000 lx photopic illuminance
Datetime Photopic.lux MEDI
ID01
2023-08-07 15:10:00 1.302806e+05 1.345450e+05
ID03
2023-08-09 16:49:00 1.350211e+05 1.347234e+05
ID07
2023-09-05 13:56:00 1.455028e+05 1.298996e+05
2023-09-06 13:35:00 1.370206e+05 1.223984e+05
ID18
2023-09-26 19:28:00 1.267747e+27 9.370563e+23
2023-09-26 19:32:00 1.267747e+27 9.370563e+23
2023-09-26 19:34:00 1.575518e+15 1.613236e+12
2023-09-26 19:36:00 1.267747e+27 9.370563e+23
#apply the filter for the upper illuminance threshold
data_switzerland <-
data_switzerland %>% 
  filter(Photopic.lux <= illuminance_upper_th | is.na(Photopic.lux)) %>% 
  gap_handler()

#set illuminance values to 0.1 if they are below 1 lux (as the sensor does not measure below 1 lux, but 0.1 lx can be logarithmically transformed)
data_switzerland <- 
  data_switzerland %>% 
  mutate(MEDI = ifelse(Photopic.lux < 1, 0.1, MEDI))

#show a summary of data missing in general
data_switzerland %>% filter(!is.na(MEDI)) %>% 
  implicitly_missing_summary( 
                           "Missing data in the Swiss dataset overall", 
                           60)
Missing data in the Swiss dataset overall
                                              min      max   median     mean     total
Number of gaps                                  1        4        1        2        35
Duration missed by available data points   <1 day  11 days   <1 day    1 day   28 days
Duration covered by available data points  7 days  30 days  29 days  27 days  554 days
Number of missing data points                  21   16,017       44    2,035    40,698
Number of available data points            11,476   44,586   42,974   39,954   799,070
Percentage of missing data points            0.0%    40.4%     0.1%     5.9%      4.8%
min, max, median, mean, and total values for 20 participants

4.1.6 Import LEBA data (Switzerland)

In this section, the LEBA data are imported. The data are stored in the REDCap_CajochenASEAN_DATA_2023-10-30_1215.csv file in the REDCap folder and contain questionnaire data. Days are coded 1 to X (X being the last day), so the day indices need to be connected with the dates from the light data.

The Basel data also contain more columns coded with leba than the Malaysia data. Only the columns contained in both locations are kept.

#Import the data
leba_folders <- "data/Basel/REDCap"
leba_file <- "REDCap_CajochenASEAN_DATA_2023-10-30_1215.csv"
leba_files <- paste(leba_folders, leba_file, sep = "/")

leba_switzerland <- 
  read_csv(leba_files, id = "file.path") %>% 
  mutate(Day = parse_number(redcap_event_name)) %>% 
  fill(code) %>% 
  select(Id = code, Day, starts_with("leba"))
Rows: 235 Columns: 142
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr    (4): redcap_event_name, code, leba_weekly_timestamp, psqi_q5j_other_d...
dbl  (125): record_id, teilnahme, registrierung_fr_studie_complete, leba_f1_...
lgl    (7): registrierung_fr_studie_timestamp, screening_scores_leba_weekly_...
dttm   (2): leba_end_timestamp, psqi_timestamp
time   (3): psqi_q1_bedtime, psqi_q3_getuptime, psqi_q4_sleephrs

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#replace the lowercase factor prefix f with an uppercase F
names(leba_switzerland) <- 
names(leba_switzerland) %>% str_replace_all("f(\\d)", "F\\1")

#set the factor levels for leba
leba_levels <- c("Never", "Rarely", "Sometimes", "Often", "Always")

#get the column names
colnames_leba <- names(leba_questions)

#merge columns with the same base name where one ends in _2, and unite the timestamp columns
leba_switzerland <- 
  colnames_leba %>% 
  reduce(
    \(y,x) 
    y %>% 
      unite(
        col = !!x, 
        matches(x), matches(paste0(x, "_2")),
        na.rm = TRUE
        ),
    .init = leba_switzerland
    ) %>% 
  unite(col = Datetime, 
        leba_weekly_timestamp, leba_end_timestamp, 
        na.rm = TRUE) %>% 
  select(Id, Day, Datetime, starts_with("leba_f")) %>%
  mutate(across(starts_with("leba"),
           ~ factor(.x, levels = 1:5, labels = leba_levels))) %>% 
  drop_na()

#set variable labels
var_label(leba_switzerland) <- leba_questions %>% as.list()

As the codes in REDCap and the Ids in the light data do not match, a conversion table is needed.

file_conv <- "data/Basel/Participant List Anonymised.xlsx"
#import the conversion file
conv_switzerland <- 
  read_xlsx(file_conv) %>% 
  select(ID, Code) %>% 
  mutate(ID = sprintf("ID%02d", ID))

#one Id in the conversion file does not match with a code in the Leba data
# conversion file: 1998NABR
# leba data: 1999NABR
# we correct the conversion file according to the leba data
conv_switzerland <- 
  conv_switzerland %>% 
  mutate(Code = if_else(Code == "1998NABR", "1999NABR", Code))

#add the conversion table to the leba data
leba_switzerland <- 
  leba_switzerland %>% 
  left_join(conv_switzerland, by = c("Id" = "Code")) %>% 
  select(-Id) %>% 
  rename(Id = ID) %>% 
  drop_na(starts_with("leba")) %>% 
  dplyr::relocate(Id, .before = 1)
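The left join above maps each REDCap code onto its light-logger Id. In base R, the same mapping reduces to a merge; the codes and Ids below are hypothetical stand-ins:

```r
# Minimal sketch of the code-to-Id conversion join (hypothetical values)
leba <- data.frame(Code = c("1999NABR", "2001XYAB"), score = c(3, 4))
conv <- data.frame(Code = c("1999NABR", "2001XYAB"), ID = c("ID01", "ID02"))
merge(leba, conv, by = "Code")$ID  # "ID01" "ID02"
```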

4.1.7 Determining the cutoff for required data of participant-days

Each participant-day must have at least a set number of data points, or it is excluded from the analysis. The procedure to determine this cutoff is described in the preregistration document. Data are aggregated to hourly values, and thresholds are tested in 1-hour steps, i.e., 1/24 of a day. The minimum threshold is set to 3/24, as calculating the IV requires 3 data points. Participants MY004, MY009, and MY012 were chosen randomly to determine the threshold.

This section took about 40 hours of compute time on an M2 MacBook Pro with 64 GB RAM. For normal execution, the bootstrapped results are therefore stored in data/processed and only analysed here.
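The per-day sufficiency check boils down to comparing the fraction of non-missing minutes against the threshold. The project sources its own data_sufficient() helper above; the function below is an assumed, simplified stand-in, not the project code:

```r
# Simplified stand-in for the sourced data_sufficient() helper:
# a day passes if the fraction of non-missing values meets the threshold
day_sufficient <- function(x, threshold) mean(!is.na(x)) >= threshold

minutes_per_day <- 24 * 60
full_day  <- rnorm(minutes_per_day)
gappy_day <- replace(full_day, 1:(8 * 60), NA)  # 8 hours missing

day_sufficient(full_day,  18/24)  # TRUE
day_sufficient(gappy_day, 18/24)  # FALSE: only 16/24 of the day remains
```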

#choosing three participants without missing data. 

# n_participants <- 3
# 
# coverage <- 
# data_malaysia %>% filter(!is.na(MEDI)) %>%
#     summarize(
#       length_no_complete = length(Datetime),
#       length_days = length_no_complete/60/24
#     )
# 
# random_participants <- 
# suppressWarnings({
#   data_malaysia %>% 
#     filter(!is.na(MEDI)) %>% 
#     gap_finder(gap.data = TRUE, silent = TRUE) %>% 
#         group_by(Id, .drop = FALSE) %>% 
#         summarise(gaps = max(gap.id)
#         ) %>% 
#         mutate(gaps = ifelse(gaps == -Inf, 0, gaps)) 
#     }) %>% 
#   left_join(coverage, by = "Id") %>% 
#   filter(gaps == 0 & length_days >=29) %>% 
#   slice_sample(n = n_participants) %>% 
#   pull(Id)

random_participants <- c("MY004", "MY009", "MY012")

The chosen metrics are Llit 10, M/P ratio, IV, the chosen participants are MY004, MY009, MY012.

4.1.7.1 Calculating metrics
#filter the dataset
subset_malaysia <- 
  data_malaysia %>% 
  filter(Id %in% random_participants)

#remove the (incomplete) first and last day of measurement
subset_malaysia <- 
  subset_malaysia %>% 
  aggregate_Datetime(unit = "1 hour") %>% 
  mutate(Day = date(Datetime)) %>%
  filter_Datetime(filter.expr = 
                    Day > min(Day) & Day < (max(Day)-days(1)))

subset_malaysia %>% gg_overview()

#metrics function
metrics_function <- function(dataset) {
  dataset %>% 
    summarize(
    MP_ratio = mean(MEDI)/mean(Photopic.lux),
            LLiT10 =
              timing_above_threshold(
              MEDI, Datetime, "below", 10, as.df = TRUE),
            IV = intradaily_variability(MEDI, Datetime),
            .groups = "drop_last"
            ) %>% 
  unnest_wider(col = c(LLiT10)) %>%
  select(-first_timing_below_10, -mean_timing_below_10) %>% 
    mutate(last_timing_below_10 = 
             hms::as_hms(last_timing_below_10) %>% as.numeric)
  }

# calculate the metrics
subset_metrics <-
  subset_malaysia %>%
  group_by(Day, .add = TRUE) %>%
  metrics_function() %>% 
      pivot_longer(
        cols = -c(Day, Id), names_to = "metric", values_to = "value") %>% 
  group_by(metric, Id) %>% 
  summarize(
    mean = mean(value, na.rm = TRUE),
    sd = sd(value, na.rm = TRUE),
    se = sd/sqrt(n()-1),
    qlower = quantile(value, 0.4, na.rm = TRUE),
    qupper = quantile(value, 0.6, na.rm = TRUE),
    upper_Acc = max(value, na.rm = TRUE),
    lower_Acc = min(value, na.rm = TRUE),
    .groups = "drop"
  )
4.1.7.2 Creating bootstraps
#for each participant-day, remove the specified number of datapoints
#and calculate the metrics
bootstrap_basis <- 
  subset_malaysia %>% 
  group_by(Id, Day) %>% 
  select(Id, Day, Datetime, MEDI, Photopic.lux)
#thresholds for the bootstrapping
n_bootstraps <- 2
thresholds <- seq(1, 21, by = 1)
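Each bootstrap at threshold x drops x of the 24 hourly values from every participant-day before recomputing the metrics. A base-R sketch of a single draw (the data are illustrative; the production run uses slice_sample() on the grouped dataset):

```r
# One bootstrap draw: drop `x` random hours from a 24-hour day (sketch)
set.seed(1)
day_hours <- data.frame(hour = 0:23, MEDI = runif(24, 0, 1000))
x <- 6                                    # hours removed at this threshold
kept <- day_hours[sample(nrow(day_hours), 24 - x), ]
nrow(kept)  # 18 hours remain, mimicking slice_sample(n = 24 - x)
```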

#create the bootstraps

#this is commented out in production, because the process takes about 40+ hours of compute-time

#creating 5000 files with 2 bootstraps each
# (1:(0.5*10^4)) %>% walk(\(x) {
# 
#   bootstraps <-
#   tibble(
#     threshold = rep(thresholds, each = n_bootstraps),
#     bootstrap_id = seq_along(threshold),
#          data =
#            threshold %>% map(\(x) bootstrap_basis %>% slice_sample(n = 24-x))
#   ) %>%
#   unnest(data) %>%
#   group_by(threshold, bootstrap_id, Id, Day) %>%
#   arrange(Datetime, .by_group = TRUE) %>%
#   metrics_function() %>%
#   pivot_longer(cols = -c(threshold, bootstrap_id, Id, Day), names_to = "metric", values_to = "value")
# #
# #save these to disk
# save(bootstraps, file = sprintf("data/processed/bootstrap%04d.RData", x))
# 
# })
#then bring the bootstraps together to one file

# bootdata <- tibble()

# walk(1:5000, \(x) {
#   load(sprintf("data/processed/bootstrap%04d.RData", x))
#   bootdata <<- bind_rows(bootdata, bootstraps)
# })

# bootstraps <- bootdata
# save(bootstraps, file = "data/processed/bootstraps.RData")
#bootstrap condensation
# source("scripts/bootstrap_condensation.R")
#if previous chunks for bootstrapping are executed, comment this chunk out
load(file = "data/processed/bootstraps.RData")
#the bootstraps are summarized at the metric-per-participant level for each threshold and compared to the original metrics and their bounds; the condensation is done in "scripts/bootstrap_condensation.R"
bootstrap_comparison <-
bootstraps %>%
  left_join(subset_metrics, by = c("Id", "metric"))

#visualize the results
bootstrap_comparison %>%
  ggplot(aes(x = threshold)) +
    geom_ribbon(aes(ymin = (mean - 2*se), ymax = (mean + 2*se)), 
                fill = "steelblue", alpha = 0.4) +
  geom_hline(data = subset_metrics, aes(yintercept=mean), color = "steelblue") +
  geom_errorbar(
    aes(ymin = lower_bs1, ymax = upper_bs1), linewidth = 0.5, width = 0) +
  geom_errorbar(aes(ymin = lower_bs2, ymax = upper_bs2), 
                linewidth = 0.25, width = 0) +
  geom_point(aes(y=mean_bs, 
                 col = ((mean - 2*se) <= lower_bs3 & upper_bs3 <= (mean + 2*se)))) +
  facet_wrap(Id~metric, scales = "free") +
  theme_minimal() +
  labs(x = "Hours missing from the day (Threshold)", y = "Metric value", col = "Within Range") +
  theme(legend.position = "bottom")

Summary of the bootstrapping procedure to determine an acceptable threshold of daily missing data. The blue horizontal line shows the average value of data across the full dataset, the blue rectangle a 95% confidence interval around that average. Dots show the average value across all bootstraps of that respective threshold value. A blue dot indicates that all 10^4 bootstraps lie within the 95% confidence interval of the full dataset. A red dot indicates that at least one bootstrap lies outside of the 95% confidence interval.

The most conservative threshold is chosen based on participant MY009 and the metric last_timing_below_10, which allows up to 6 hours of missing data per day, i.e., 25%. For this assumption to hold, the data must be missing at random hours of the day.

#at least 18 of 24 hours of data are required per day (up to 6 hours, i.e. 25%, may be missing)
missing_threshold <- 18/24

4.1.8 Remove participant-days with insufficient data

The threshold determined in Section 4.1.7 is used to remove participant-days with insufficient data.
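The sufficiency check itself can be sketched with a minimal stand-in (an assumption: mirroring what the name of data_sufficient() suggests, namely comparing the fraction of non-missing observations in a day against the threshold):

```r
# Minimal stand-in for a day-level sufficiency check (assumption: mirrors the
# logic of data_sufficient()): TRUE when the fraction of non-missing values
# meets the required threshold.
sufficient_day <- function(x, threshold) {
  mean(!is.na(x)) >= threshold
}

sufficient_day(c(100, 250, NA, 500), 18/24)  # 3/4 observed >= 0.75 -> TRUE
```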

#mark participant-days with insufficient data based on the missing-data threshold
data_malaysia <- 
  data_malaysia %>% 
  group_by(Day = date(Datetime), .add = TRUE) %>%
    mutate(marked_for_removal = 
             !data_sufficient(MEDI, missing_threshold),
           .before = 1
          ) %>% 
  ungroup(Day)

#display the final malaysia dataset prior to analysis, where each dot marks the start/end of a day
data_malaysia %>% 
  filter(!marked_for_removal) %>% 
  group_by(Day, .add = TRUE) %>%  
  gg_overview()

#mark participant-days with insufficient data based on the missing-data threshold
data_switzerland <- 
  data_switzerland %>% 
  group_by(Day = date(Datetime), .add = TRUE) %>%
    mutate(marked_for_removal = 
             !data_sufficient(MEDI, missing_threshold),
           .before = 1
          ) %>% 
  ungroup(Day)

#display the final switzerland dataset prior to analysis, where each dot marks the start/end of a day
data_switzerland %>% 
  filter(!marked_for_removal) %>% 
  group_by(Day, .add = TRUE) %>%  
  gg_overview()

#visualize the data as a doubleplot of an average day
data_malaysia %>% 
  filter(!marked_for_removal) %>% 
  ungroup() %>% 
  aggregate_Date(numeric.handler = \(x) mean(x, na.rm = TRUE)) %>% 
  gg_doubleplot(fill = pal_jco()(1))

#visualize the data as a doubleplot of an average day
data_switzerland %>% 
  filter(!marked_for_removal) %>% 
  ungroup() %>% 
  aggregate_Date(numeric.handler = \(x) mean(x, na.rm = TRUE)) %>% 
  gg_doubleplot()

# ggsave("figures/average_day_basel.pdf", height = 27)

4.1.9 Combine datasets and descriptives

The datasets are combined into one dataset for further analysis.

#combine the light exposure data
data <- list(malaysia = data_malaysia %>% filter(!marked_for_removal), 
             switzerland = data_switzerland %>% filter(!marked_for_removal))

descriptive <- 
data %>% map(\(x) {
  x %>% tbl_summary(include = c(MEDI),
                              by = Id,
                              missing_text = "Missing",
                              statistic = list(
                                               MEDI ~ "{mean} ({min}, {max})")
                              )
})

descriptive[[1]]
Id      N        MEDI1                Missing
MY001   41,760   266 (0, 102,316)       0
MY002   20,160   124 (0, 49,984)      187
MY003   41,760   183 (0, 102,319)       0
MY004   41,760   246 (0, 77,476)        0
MY005   41,760   207 (0, 104,484)       0
MY007   41,760   224 (0, 67,146)        0
MY008   41,760   148 (0, 101,702)       0
MY009   41,760   512 (0, 109,127)       0
MY010   34,560   176 (0, 79,192)      114
MY011   40,320    68 (0, 50,185)      196
MY012   43,200   127 (0, 58,776)        0
MY013   41,760   289 (0, 116,735)      25
MY014   41,760   440 (0, 125,709)       1
MY015   41,760   134 (0, 34,199)        0
MY016   31,680   188 (0, 67,580)       57
MY017   43,200   159 (0, 89,517)       13
MY018   15,840   175 (0, 62,551)      514
MY019   43,200   138 (0, 109,826)     370
MY020   15,840    97 (0, 81,403)        0
1 Mean (Min, Max)
descriptive[[2]]
Id      N        MEDI1                  Missing
ID01    33,120   1,294 (0, 130,038)     375
ID02    41,760   1,005 (0, 113,549)      25
ID03    41,760   1,357 (0, 125,807)     228
ID04    40,320   1,337 (0, 114,184)      21
ID05    40,320     425 (0, 93,199)       23
ID06    41,760     804 (0, 130,088)     129
ID07    43,200   1,211 (0, 114,327)      22
ID08    41,760   1,047 (0, 119,789)      25
ID09     8,640      79 (0, 72,266)        0
ID10    40,320     306 (0, 97,892)       21
ID11    41,760   1,043 (0, 120,756)      51
ID12    40,320     835 (0, 105,709)       0
ID13    41,760   1,134 (0, 108,602)      25
ID14    41,760     548 (0, 123,006)      26
ID15    23,040     187 (0, 112,956)     322
ID16    41,760     481 (0, 107,579)      26
ID17    41,760     311 (0, 119,461)      29
ID18    41,760   1,007 (0, 117,245)      36
ID19    40,320     403 (0, 117,124)     485
ID20    40,320      77 (0, 93,858)       34
1 Mean (Min, Max)
#combine the leba data
leba_data <- list(malaysia = leba_malaysia, switzerland = leba_switzerland)
leba_data_combined <- leba_data %>% list_rbind(names_to = "site")
var_label(leba_data_combined) <- leba_questions %>% as.list()

#histogram summary
leba_data_combined %>% 
  pivot_longer(cols = -c(site, Id, Day, Datetime), 
               names_to = "item", values_to = "value") %>% 
  group_by(site, item) %>% 
  nest() %>% 
  ungroup() %>% 
  mutate(question = rep(leba_questions, 2)) %>% 
  unnest_wider(data) %>% 
  select(-c(Id,Day, Datetime)) %>% 
  mutate(value = map(value, \(x) x %>% as.numeric() %>% hist(breaks = (0.5+0:5), plot = FALSE) %>% .$counts)) %>% 
  pivot_wider(id_cols = c(question, item), names_from = site, values_from = value) %>% 
  gt() %>% 
  cols_nanoplot(new_col_name = "Malaysia", 
                columns = c(malaysia), plot_type = "bar",
                options = nanoplot_options(
                  data_bar_fill_color = pal_jco()(1),
                  data_bar_stroke_color = pal_jco()(1)
                )) %>% 
  cols_nanoplot(new_col_name = "Switzerland", 
                columns = c(switzerland), plot_type = "bar",
                options = nanoplot_options(
                  data_bar_fill_color = pal_jco()(2)[2],
                  data_bar_stroke_color = pal_jco()(2)[2]
                )) %>% 
  cols_label(question = "LEBA Item", 
             item = "Item coding")
Histogram summary of LEBA item responses, rendered as bar nanoplots per item (columns: LEBA Item, Item coding, Malaysia, Switzerland); the underlying counts per response category are listed in the numeric summary table that follows.
#numeric summary table
leba_data_combined %>% 
  tbl_summary(include = -c(Id, Day, Datetime), by = site)
Characteristic    malaysia, N = 171    switzerland, N = 175
I wear blue-filtering, orange-tinted, and/or red-tinted glasses indoors during the day

    Never 128 (75%) 150 (86%)
    Rarely 13 (7.6%) 7 (4.0%)
    Sometimes 10 (5.8%) 8 (4.6%)
    Often 11 (6.4%) 1 (0.6%)
    Always 9 (5.3%) 9 (5.1%)
I wear blue-filtering, orange-tinted, and/or red-tinted glasses outdoors during the day

    Never 114 (67%) 149 (85%)
    Rarely 32 (19%) 8 (4.6%)
    Sometimes 7 (4.1%) 4 (2.3%)
    Often 10 (5.8%) 1 (0.6%)
    Always 8 (4.7%) 13 (7.4%)
I wear blue-filtering, orange-tinted, and/or red-tinted glasses within 1 hour before attempting to fall asleep

    Never 135 (79%) 154 (88%)
    Rarely 18 (11%) 3 (1.7%)
    Sometimes 7 (4.1%) 15 (8.6%)
    Often 2 (1.2%) 2 (1.1%)
    Always 9 (5.3%) 1 (0.6%)
I spend 30 minutes or less per day (in total) outside

    Never 26 (15%) 56 (32%)
    Rarely 52 (30%) 51 (29%)
    Sometimes 45 (26%) 33 (19%)
    Often 34 (20%) 22 (13%)
    Always 14 (8.2%) 13 (7.4%)
I spend between 30 minutes and 1 hour per day (in total) outside

    Never 5 (2.9%) 22 (13%)
    Rarely 42 (25%) 49 (28%)
    Sometimes 71 (42%) 38 (22%)
    Often 39 (23%) 47 (27%)
    Always 14 (8.2%) 19 (11%)
I spend between 1 and 3 hours per day (in total) outside

    Never 12 (7.0%) 38 (22%)
    Rarely 31 (18%) 39 (22%)
    Sometimes 71 (42%) 41 (23%)
    Often 46 (27%) 43 (25%)
    Always 11 (6.4%) 14 (8.0%)
I spend more than 3 hours per day (in total) outside

    Never 32 (19%) 57 (33%)
    Rarely 45 (26%) 46 (26%)
    Sometimes 41 (24%) 38 (22%)
    Often 46 (27%) 25 (14%)
    Always 7 (4.1%) 9 (5.1%)
I spend as much time outside as possible

    Never 50 (29%) 40 (23%)
    Rarely 55 (32%) 44 (25%)
    Sometimes 36 (21%) 33 (19%)
    Often 25 (15%) 34 (19%)
    Always 5 (2.9%) 24 (14%)
I go for a walk or exercise outside within 2 hours after waking up

    Never 103 (60%) 71 (41%)
    Rarely 50 (29%) 53 (30%)
    Sometimes 13 (7.6%) 27 (15%)
    Often 5 (2.9%) 19 (11%)
    Always 0 (0%) 5 (2.9%)
I use my mobile phone within 1 hour before attempting to fall asleep

    Never 6 (3.5%) 1 (0.6%)
    Rarely 1 (0.6%) 9 (5.1%)
    Sometimes 4 (2.3%) 13 (7.4%)
    Often 37 (22%) 41 (23%)
    Always 123 (72%) 111 (63%)
I look at my mobile phone screen immediately after waking up

    Never 9 (5.3%) 7 (4.0%)
    Rarely 20 (12%) 16 (9.1%)
    Sometimes 31 (18%) 25 (14%)
    Often 42 (25%) 57 (33%)
    Always 69 (40%) 70 (40%)
I check my phone when I wake up at night

    Never 27 (16%) 80 (46%)
    Rarely 32 (19%) 38 (22%)
    Sometimes 53 (31%) 25 (14%)
    Often 23 (13%) 14 (8.0%)
    Always 36 (21%) 18 (10%)
I look at my smartwatch within 1 hour before attempting to fall asleep

    Never 148 (87%) 154 (88%)
    Rarely 22 (13%) 1 (0.6%)
    Sometimes 1 (0.6%) 6 (3.4%)
    Often 0 (0%) 9 (5.1%)
    Always 0 (0%) 5 (2.9%)
I look at my smartwatch when I wake up at night

    Never 160 (94%) 157 (90%)
    Rarely 11 (6.4%) 0 (0%)
    Sometimes 0 (0%) 4 (2.3%)
    Often 0 (0%) 12 (6.9%)
    Always 0 (0%) 2 (1.1%)
I dim my mobile phone screen within 1 hour before attempting to fall asleep

    Never 47 (27%) 51 (29%)
    Rarely 17 (9.9%) 20 (11%)
    Sometimes 33 (19%) 22 (13%)
    Often 33 (19%) 38 (22%)
    Always 41 (24%) 44 (25%)
I use a blue-filter app on my computer screen within 1 hour before attempting to fall asleep

    Never 114 (67%) 123 (70%)
    Rarely 9 (5.3%) 2 (1.1%)
    Sometimes 3 (1.8%) 8 (4.6%)
    Often 13 (7.6%) 9 (5.1%)
    Always 32 (19%) 33 (19%)
I use as little light as possible when I get up during the night

    Never 13 (7.6%) 16 (9.1%)
    Rarely 18 (11%) 12 (6.9%)
    Sometimes 46 (27%) 23 (13%)
    Often 54 (32%) 18 (10%)
    Always 40 (23%) 106 (61%)
I dim my computer screen within 1 hour before attempting to fall asleep

    Never 59 (35%) 103 (59%)
    Rarely 26 (15%) 17 (9.7%)
    Sometimes 26 (15%) 12 (6.9%)
    Often 28 (16%) 22 (13%)
    Always 32 (19%) 21 (12%)
I use tunable lights to create a healthy light environment

    Never 128 (75%) 134 (77%)
    Rarely 15 (8.8%) 13 (7.4%)
    Sometimes 22 (13%) 10 (5.7%)
    Often 5 (2.9%) 5 (2.9%)
    Always 1 (0.6%) 13 (7.4%)
I use LEDs to create a healthy light environment

    Never 89 (52%) 141 (81%)
    Rarely 19 (11%) 12 (6.9%)
    Sometimes 35 (20%) 9 (5.1%)
    Often 15 (8.8%) 5 (2.9%)
    Always 13 (7.6%) 8 (4.6%)
I use a desk lamp when I do focused work

    Never 128 (75%) 138 (79%)
    Rarely 20 (12%) 9 (5.1%)
    Sometimes 5 (2.9%) 13 (7.4%)
    Often 13 (7.6%) 11 (6.3%)
    Always 5 (2.9%) 4 (2.3%)
I use an alarm with a dawn simulation light

    Never 143 (84%) 166 (95%)
    Rarely 14 (8.2%) 0 (0%)
    Sometimes 12 (7.0%) 1 (0.6%)
    Often 2 (1.2%) 5 (2.9%)
    Always 0 (0%) 3 (1.7%)
I turn on the lights immediately after waking up

    Never 27 (16%) 74 (42%)
    Rarely 39 (23%) 39 (22%)
    Sometimes 59 (35%) 19 (11%)
    Often 31 (18%) 21 (12%)
    Always 15 (8.8%) 22 (13%)
1 n (%)

4.2 Calculate the photoperiod

In this section, the dataset is extended by a column indicating one of three states of the day: day, evening, and night. The states are determined by the photoperiod at the location of data collection. day is defined as the time between sunrise (dawn) and sunset (dusk), evening as the time between sunset and midnight, and night as the time between midnight and sunrise. Midnight is chosen as a somewhat arbitrary differentiator between states, as sleep timing is not available in the dataset or auxiliary data collection.
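The three-state classification can be illustrated on toy timestamps with fixed dawn and dusk times (in the analysis below, dawn and dusk come from the computed photoperiods):

```r
library(dplyr)

# Toy illustration of the day/evening/night split defined above,
# with fixed dawn and dusk times in place of computed photoperiods
dawn  <- as.POSIXct("2024-05-01 06:30:00", tz = "UTC")
dusk  <- as.POSIXct("2024-05-01 19:45:00", tz = "UTC")
times <- as.POSIXct(c("2024-05-01 03:00:00",   # before dawn -> night
                      "2024-05-01 12:00:00",   # dawn to dusk -> day
                      "2024-05-01 22:00:00"),  # after dusk -> evening
                    tz = "UTC")

case_when(times < dawn ~ "night",
          times < dusk ~ "day",
          .default     = "evening")  # "night" "day" "evening"
```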

#extract the days for the data collection
relevant_days <- 
  data %>% 
  map(~ .x %>% ungroup() %>% pull(Day) %>% unique)

#calculate sunrise and sunset times for the location of the data collection
photoperiods <- 
  names(data) %>% 
  rlang::set_names() %>% 
  map(\(x) {
        photoperiod(coordinates[[x]], relevant_days[[x]], tz = tzs[[x]])
  })

#add the photoperiod to the data
data <- 
  data %>% 
  map2(photoperiods, ~ left_join(.x, .y, by = c("Day" = "date")))

#set categories for the photoperiod
data <- 
  data %>%
  map(\(x) {
    x %>% 
      group_by(Day, .add = TRUE) %>%
      mutate(Photoperiod = case_when(
        Datetime < dawn ~ "night",
        Datetime < dusk ~ "day",
        Datetime >= dusk ~ "evening"),
        photoperiod_duration = sum(Photoperiod == "day")/60
      ) %>% 
      ungroup(Day)
  })

#display an exemplary week with photoperiods

data %>% 
  map2(c("MY001", "ID01"), \(x,y) {
    x %>% 
  filter(!marked_for_removal) %>%
  filter(Id == y) %>% 
  filter_Date(length = "1 week") %>% 
  gg_day(geom = "ribbon", alpha = 0.5, col = "black", aes_fill = Id) + 
  geom_rect(data = \(x) x %>% summarize(dawn = mean(hms::as_hms(dawn)), 
                                         dusk = mean(hms::as_hms(dusk))), 
   aes(xmin = 0, xmax = dawn, ymin = -Inf, ymax = Inf), alpha = 0.25)+
  geom_rect(data = \(x) x %>% summarize(dawn = mean(hms::as_hms(dawn)), 
                                         dusk = mean(hms::as_hms(dusk))), 
   aes(xmin = dusk, xmax = 24*60*60, ymin = -Inf, ymax = Inf), alpha = 0.25)+
  theme(legend.position = "none")+
  labs(title = paste0("Example week with photoperiod for participant ", y))

  })
Example week with photoperiod for participants MY001 (Malaysia) and ID01 (Switzerland); shaded areas mark the times before dawn and after dusk.
The dataset spans a period of 39 days in Malaysia and 92 days in Switzerland.

5 Inferential Analysis

5.1 Photoperiod duration

photoperiods <- 
data %>% list_rbind(names_to = "site") %>% 
  group_by(site, Day, photoperiod_duration) %>% 
  summarize(.groups = "drop")

t.test(photoperiod_duration ~ site, data = photoperiods)

    Welch Two Sample t-test

data:  photoperiod_duration by site
t = -10.851, df = 91.007, p-value < 2.2e-16
alternative hypothesis: true difference in means between group malaysia and group switzerland is not equal to 0
95 percent confidence interval:
 -1.998324 -1.379909
sample estimates:
   mean in group malaysia mean in group switzerland 
                 12.70128                  14.39040 

Photoperiods are significantly different between sites. Thus, applicable metrics will be corrected by the duration of the photoperiod (suffix dc). In the case of evening or night data (e.g., TBTe10), the correction is based on the respective evening or night duration. This was not specified in the preregistration document.
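The duration correction applied below amounts to dividing each metric by the duration of its reference period, turning absolute durations into rates (toy numbers for illustration):

```r
# Toy example of the photoperiod-duration correction (dc): a duration metric
# in seconds is divided by the photoperiod duration in hours, yielding
# seconds above threshold per hour of photoperiod.
tat250_s      <- 3 * 3600  # 3 h above 250 lx mel EDI, in seconds
photoperiod_h <- 12.7      # photoperiod duration in hours

round(tat250_s / photoperiod_h)  # ~850 s above threshold per photoperiod hour
```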

5.2 Research Question 1

RQ 1: Are there differences in objectively measured light exposure between the two sites, and if so, in which light metrics?

5.2.1 Hypothesis 1

\(H1\): There are differences in light logger-derived light exposure intensity levels and duration of intensity between Malaysia and Switzerland.

5.2.1.1 Metric calculation

In this section, metrics will be calculated for the light exposure data.

The metrics are as follows:

H1:

  • Time above Threshold 250 lx mel EDI (TAT250)

  • Time above Threshold 1000 lx mel EDI (TAT1000)

  • Period above Threshold 1000 lx mel EDI (PAT1000)

  • Time above Threshold 250 lx mel EDI during daytime hours (TATd250)

  • Time below Threshold 10 lx mel EDI during evening hours (TBTe10)

metric_selection <- list(
  H1 = c("TAT250", "TAT1000", "PAT1000", "TATd250", "TBTe10")
)

p_adjustment <- list(
  H1 = 5
)

metrics <- 
  data %>% 
  map(
    \(x) {
      #whole-day metrics
      whole_day <-
      x %>%
        group_by(Id, Day) %>%
        summarize(
          TAT250 =
            duration_above_threshold(MEDI, Datetime, "above", 250, na.rm = TRUE),
          TAT1000 =
            duration_above_threshold(MEDI, Datetime, "above", 1000, na.rm = TRUE),
          PAT1000 =
            period_above_threshold(MEDI, Datetime, "above", 1000, na.rm = TRUE),
          .groups = "drop",
          photoperiod_duration = first(photoperiod_duration)
        ) %>%
        mutate(across(where(is.duration), as.numeric)) %>% 
        pivot_longer(cols = -c(Id, Day, photoperiod_duration), names_to = "metric")
      
      #part-day metrics
      part_day <- 
        x %>% 
        group_by(Id, Day, Photoperiod) %>% 
        summarize(
          TATd250 = 
            duration_above_threshold(MEDI, Datetime, "above", 250, na.rm = TRUE),
          TBTe10 = 
            duration_above_threshold(MEDI, Datetime, "below", 10, na.rm =  TRUE),
          .groups = "drop",
          photoperiod_duration = first(photoperiod_duration)
        ) %>% 
        mutate(across(where(is.duration), as.numeric)) %>% 
        pivot_longer(cols = -c(Id, Day, Photoperiod,
                               photoperiod_duration), names_to = "metric") %>% 
        filter((Photoperiod == "day" & metric == "TATd250") |
               (Photoperiod == "evening" & metric == "TBTe10"))
      
      whole_day %>% 
        bind_rows(part_day)
    }
  ) %>% 
  list_rbind(names_to = "site") %>% 
  nest(data = -metric)
5.2.1.2 Model fitting

All models for Hypothesis 1 showed very poor model diagnostics under a Gaussian error distribution. According to the model diagnostics, the Poisson family is a much more appropriate error distribution for these duration-based metrics.
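The rationale can be reproduced on simulated count data, where the variance scales with the mean (toy data, not from the study):

```r
# Toy demonstration: for count-like data a Poisson GLM usually outperforms a
# Gaussian one, since the Poisson variance is tied to the mean.
set.seed(1)
toy <- data.frame(site = rep(c("A", "B"), each = 50))
toy$y <- rpois(nrow(toy), lambda = ifelse(toy$site == "A", 5, 9))

fit_pois  <- glm(y ~ site, family = poisson(),  data = toy)
fit_gauss <- glm(y ~ site, family = gaussian(), data = toy)

AIC(fit_pois, fit_gauss)  # the Poisson fit typically shows the lower AIC here
```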

formula_H1 <- value_dc ~ site + (1|site:Id)
formula_H0 <- value_dc ~ 1 + (1|site:Id)

map <- purrr::map

inference_H1 <- 
  metrics %>% 
  filter(metric %in% metric_selection$H1) %>% 
  mutate(data = 
           data %>% 
           purrr::map(\(x) x %>% mutate(value_dc = 
                                   (value/photoperiod_duration) %>% round()))
         )

inference_H1 <- 
  inference_H1 %>% 
  inference_summary(formula_H1, formula_H0, p_adjustment = p_adjustment$H1,
                    family = poisson())


H1_table <- 
Inference_Table(inference_H1, p_adjustment = p_adjustment$H1, value = value_dc)

H1_table <-
H1_table %>%
  tab_footnote(
    "Exponentiated beta coefficients from the final model, denoting the multiplication factor for the intercept, conditional on the site",
    locations = cells_column_labels(columns = c("malaysia", "switzerland"))
  ) %>%
  tab_footnote(
    "Model prediction for the intercept per hour of photo- or nighttime period, reference level for the site is Malaysia",
    locations = cells_column_labels(columns = "Intercept")
  ) %>%
  fmt_duration(columns = Intercept, input_units = "seconds") %>%
  fmt_number("Intercept", decimals = 0) %>%
  fmt_number("switzerland", decimals = 3) %>%
  tab_header(title = "Model Results for Hypothesis 1", )

H1_table
Model Results for Hypothesis 1

Metric    p-value1  Intercept2  Site coefficients3
                                Malaysia  Switzerland
TAT250    0.002     418         1         1.780
TAT1000   0.011     180         1         1.942
PAT1000   0.15       78         —         —
TATd250   0.003     410         1         1.772
TBTe10    0.9       678         —         —

1 p-values are adjusted for multiple comparisons using the false-discovery-rate for n = 5 comparisons
2 Model prediction for the intercept per hour of photo- or nighttime period, reference level for the site is Malaysia
3 Exponentiated beta coefficients from the final model, denoting the multiplication factor for the intercept, conditional on the site
v1 <- gt::extract_cells(H1_table, switzerland, 1) %>% as.numeric() %>% round(3)
v2 <- gt::extract_cells(H1_table, switzerland, 2) %>% as.numeric() %>% round(3)
v3 <- gt::extract_cells(H1_table, switzerland, 4) %>% as.numeric() %>% round(3)

The model summary shows that Swiss participants have significantly more time above threshold for 250 lx (TAT250) and 1000 lx (TAT1000) than participants in Malaysia, by factors of ×1.78 and ×1.942, respectively (×1.772 for time above 250 lx mel EDI during daytime hours, TATd250).
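As a worked example of reading the table (values taken from the TAT250 row), the exponentiated Switzerland coefficient multiplies the Malaysian intercept:

```r
# Reading the TAT250 row: with a log link, the Switzerland prediction is the
# intercept times the exponentiated site coefficient.
intercept_my <- 418    # Malaysian intercept, seconds above 250 lx per hour
factor_ch    <- 1.780  # exponentiated Switzerland coefficient

intercept_my * factor_ch  # ~744 s above 250 lx per photoperiod hour
```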

5.2.1.3 Model diagnostics
Inf_plots1(x, z)

Inf_plots2(x, z)

Inf_plots3(x, z, value_dc, "exp")

Inf_plots1(x, z)

Inf_plots2(x, z)

Inf_plots3(x, z, value_dc, "exp")

Inf_plots1(x, z)

Inf_plots2(x, z)

Inf_plots3(x, z, value_dc, "exp")

Inf_plots1(x, z)

Inf_plots2(x, z)

Inf_plots3(x, z, value_dc, "exp")

Inf_plots1(x, z)

Inf_plots2(x, z)

Inf_plots3(x, z, value_dc, "exp")

5.2.2 Hypothesis 2

\(H2\): There are differences in light logger-derived timing of light exposure between Malaysia and Switzerland.

5.2.2.1 Metric calculation

H2:

  • M10m

  • L5m

  • IS

  • IV

  • LLiT 10

  • LLiT 250

  • Frequency crossing Threshold 250

  • mean logarithmic melanopic EDI day & night

metric_selection <- append(metric_selection, list(
    H2_timing = c("M10m", "L5m", "IS", "IV", "LLiT10", "LLiT250", "FcT250"),
  H2_interaction = "mean"
)
)

p_adjustment <- append(
  p_adjustment,
  list(
  H2 = 9
))

metrics <- 
  rbind(
    metrics,
      data %>% 
  map(
    \(x) {
      #whole-day metrics
      whole_day <-
      x %>%
        group_by(Id, Day) %>%
        summarize(
          .groups = "drop",
          M10m =
            bright_dark_period(
                  MEDI, Datetime, "brightest", "10 hours", 
                  as.df = TRUE, na.rm = TRUE
                  ) %>% pull(brightest_10h_mean) %>% log10(),
          L5m =
            bright_dark_period(
                  MEDI, Datetime, "darkest", "5 hours", as.df = TRUE,
                  loop = TRUE, na.rm = TRUE
                  ) %>% pull(darkest_5h_mean) %>% log10(),
          IV = intradaily_variability(MEDI, Datetime),
          LLiT10 =
            timing_above_threshold(
              MEDI, Datetime, "above", 10, as.df = TRUE) %>%
            pull(last_timing_above_10) %>% hms::as_hms() %>% as.numeric(),
          LLiT250 =
            timing_above_threshold(MEDI, Datetime, "above", 250, as.df = TRUE) %>%
            pull(last_timing_above_250) %>% hms::as_hms() %>% as.numeric(),
          FcT250 =
            frequency_crossing_threshold(MEDI, 250, na.rm = TRUE),
        ) %>%
        mutate(across(where(is.duration), as.numeric)) %>% 
        pivot_longer(cols = -c(Id, Day), names_to = "metric")
      
      #part-day metrics
      part_day <- 
        x %>% 
        group_by(Id, Day, Photoperiod) %>% 
        summarize(
          .groups = "drop",
          mean = mean(MEDI, na.rm = TRUE) %>% log10()
        ) %>% 
        pivot_longer(cols = -c(Id, Day, Photoperiod), names_to = "metric") %>% 
        filter(Photoperiod != "night" & metric == "mean")
      
      #no-day metrics
      no_day <- 
          x %>% 
        summarize(IS = interdaily_stability(MEDI, Datetime, na.rm = TRUE)) %>% 
        pivot_longer(cols = IS, names_to = "metric")
      
      whole_day %>% 
        bind_rows(part_day, no_day)
    }
  ) %>% 
  list_rbind(names_to = "site") %>% 
  nest(data = -metric)
  )

Hypothesis 2 is analyzed in three distinct ways. The first part is similar to Hypothesis 1 but looks at different metrics that are associated with the timing of light (Timing). The second part looks at a possible interaction between daytime and evening light (Interaction). The third part models light across the whole day as a smooth pattern (Pattern).

5.2.2.2 Timing
formula_H2_timing <- value ~ site + (1|site:Id)
formula_H2_IV <- value ~ site
formula_H0 <- value ~ 1 + (1|site:Id)
formula_H0_IV <- value ~ 1

lmer <- lme4::lmer

inference_H2 <- 
  metrics %>% 
  filter(metric %in% metric_selection$H2_timing) %>% 
  mutate(formula_H1 = case_match(metric,
                                 "IS" ~ c(formula_H2_IV),
                                 .default = c(formula_H2_timing)),
         formula_H0 = case_match(metric,
                                 "IS" ~ c(formula_H0_IV),
                                 .default = c(formula_H0)),
         type = (case_match(metric,
                                 "IS" ~ "lm",
                                 .default = "lmer")),
         family = list(gaussian())
         )


inference_H2 <-
  inference_H2 %>% 
  inference_summary2(p_adjustment = p_adjustment$H2)

H2_table_timing <- 
Inference_Table(inference_H2, p_adjustment = p_adjustment$H2, value = value) %>%
  fmt_number("Intercept", decimals = 2) %>%
  fmt_number("switzerland", decimals = 2) %>%
  cols_align(align = "center", columns = "p.value") %>%
  tab_header(title = "Model Results for Hypothesis 2, Timing", ) %>% 
  fmt_duration(columns = c(Intercept, switzerland), 
               input_units = "seconds",
               rows = 4:5, duration_style = "colon-sep") %>% 
  tab_footnote(
    "Values were log10 transformed before model fitting",
    locations = cells_stub(rows = 1:2)
  )

H2_table_timing
Model Results for Hypothesis 2, Timing

Metric    p-value1  Intercept  Site coefficients
                               Malaysia  Switzerland
M10m2     0.004     2.36       0         0.46
L5m2      0.2       −0.88      —         —
IV        0.9       1.28       —         —
LLiT10    0.005     23:16:14   0         −01:10:42
LLiT250   <0.001    17:41:23   0         01:27:54
FcT250    0.005     35.79      0         28.71
IS        0.9       0.16       —         —

1 p-values are adjusted for multiple comparisons using the false-discovery-rate for n = 9 comparisons
2 Values were log10 transformed before model fitting
v1 <-  gt::extract_cells(H2_table_timing, Intercept, 1) %>% as.numeric()
v2 <-  gt::extract_cells(H2_table_timing, switzerland, 1) %>% as.numeric()
v3 <-  gt::extract_cells(H2_table_timing, switzerland, 4)
v4 <-  gt::extract_cells(H2_table_timing, switzerland, 5)
v5 <-  gt::extract_cells(H2_table_timing, switzerland, 6) %>% as.numeric()
v6 <-  gt::extract_cells(H2_table_timing, Intercept, 6) %>% as.numeric()

The model summary shows that the 10 brightest hours (M10m) of Swiss participants are significantly brighter than those of participants in Malaysia, with averages of 661 lx and 229 lx, respectively, and that the frequency of crossing 250 lx is about twice as high for Swiss compared to Malaysian participants (64 compared to 36). The Swiss participants' last time above 250 lx melanopic EDI (LLiT250) is about 1.5 hours later than for Malaysian participants (+01:27:54), while their last time above 10 lx (LLiT10) is about 1 hour earlier (−01:10:42).
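The lux values quoted above follow from back-transforming the log10-scale estimates in the table:

```r
# Back-transformation of the M10m estimates (modelled on the log10 scale)
round(10^2.36)           # Malaysian intercept: ~229 lx
round(10^(2.36 + 0.46))  # Swiss prediction:    ~661 lx
```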

5.2.2.2.1 Model diagnostics
Inf_plots1(x, z)

Inf_plots2(x, z)

Inf_plots3(x, z %>% drop_na(value), value, transformation = "identity")

Inf_plots1(x, z)

Inf_plots2(x, z)

Inf_plots3(x, z %>% drop_na(value), value, transformation = "identity")

Inf_plots1(x, z)

Inf_plots2(x, z)

Inf_plots3(x, z %>% drop_na(value), value, transformation = "identity")

Inf_plots1(x, z)

Inf_plots2(x, z)

Inf_plots3(x, z %>% drop_na(value), value, transformation = "identity")

Inf_plots1(x, z)

Inf_plots2(x, z)

Inf_plots3(x, z %>% drop_na(value), value, transformation = "identity")

Inf_plots1(x, z)

Inf_plots2(x, z)

Inf_plots3(x, z %>% drop_na(value), value, transformation = "identity")

# Inf_plots1(x, z)
Inf_plots2(inference_H2$model[[7]], inference_H2$data[[7]])

Inf_plots3(inference_H2$model[[7]], inference_H2$data[[7]], 
           value, transformation = "identity")

5.2.2.3 Interaction
formula_H2_interaction <- value ~ site*Photoperiod + (1+Photoperiod|site:Id)
formula_H0_interaction <- value ~ site + Photoperiod + (1+Photoperiod|site:Id)

inference_H2.2 <- 
  tibble(metric = "log10(daily mean mel EDI)",
         data = metrics %>% filter(metric == "mean") %>% pull(data)) %>% 
  mutate(formula_H1 = c(formula_H2_interaction),
         formula_H0 = c(formula_H0_interaction),
         type = "lmer"
         )

inference_H2.2 <-
  inference_H2.2 %>%
  inference_summary2(p_adjustment = 1)

H2_table_interaction <-
Inference_Table(inference_H2.2, p_adjustment = 1, value = value) %>%
  fmt_number(
    c("Intercept", "switzerland", Photoperiodevening, "switzerland:Photoperiodevening"), 
    decimals = 2) %>%
  cols_hide(c("cor__Intercept.Photoperiodevening", "sd__Photoperiodevening")) %>%
  cols_align(align = "center", columns = "p.value") %>%
  tab_header(title = "Model Results for Hypothesis 2, Interaction", ) %>%
  tab_footnote(
    "Values were log10 transformed before model fitting"
  ) %>% 
  tab_footnote(
    "p-value refers to the interaction effect",
    locations = cells_column_labels(p.value)
  ) %>% 
  cols_label(
    Photoperiodevening = "Evening",
    `switzerland:Photoperiodevening` = "Switzerland:Evening") %>% 
  cols_add("Day" = 0, .before = Photoperiodevening) %>% 
  tab_spanner("Time coeff.", columns = c(Day, Photoperiodevening)) %>% 
  tab_spanner("Interaction coeff.", columns = c(`switzerland:Photoperiodevening`)) %>% 
  tab_style(
    style = list(
      cell_text(weight = "bold")
    ),
    locations = cells_column_spanners()
  ) %>% 
  tab_style(
    style = list(
      cell_text(weight = "bold")
    ),
    locations = cells_column_labels()
  ) %>% 
    tab_style(
      style = cell_text(color = pal_jco()(2)[2]),
      locations = cells_column_labels(c("switzerland:Photoperiodevening"))
    )

H2_table_interaction
Model Results for Hypothesis 2, Interaction

Metric                     p-value1,2  Intercept  Site coefficients      Time coeff.    Interaction coeff.
                                       Malaysia   Switzerland            Day   Evening  Switzerland:Evening
log10(daily mean mel EDI)  <0.001      2.23       0          0.41        0     −0.98    −1.11

Values were log10 transformed before model fitting
1 p-values are adjusted for multiple comparisons using the false-discovery-rate for n = 1 comparisons
2 p-value refers to the interaction effect
v1 <-  gt::extract_cells(H2_table_interaction, 3:8) %>% as.numeric()
v2 <- 10^(v1[1]+v1[3]) %>% round(0)
v3 <- 10^(v1[1]+v1[3] + inference_H2.2$summary[[1]]$estimate[3] + 
            inference_H2.2$summary[[1]]$estimate[4])  %>% round(1)
v4 <- 10^(v1[1]) %>% round(0)
v5 <- 10^(v1[1] + inference_H2.2$summary[[1]]$estimate[3]) %>% round(1)

The model reveals that Swiss participants have brighter days and darker evenings compared to the Malaysian site. Specifically, the model predicts that a Swiss participant has a mean melanopic EDI of 437 lx during daytime hours and 3.5 lx during evening hours. An average Malaysian participant reaches a mean of 170 lx during daytime and 17.6 lx during evening hours.
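These predictions follow from summing the relevant coefficients on the log10 scale and back-transforming (using the rounded coefficients from the table, so the last value differs slightly from the text):

```r
# Model predictions from the interaction table (log10-scale coefficients)
round(10^(2.23 + 0.41))                   # Swiss daytime:     ~437 lx
round(10^(2.23 + 0.41 - 0.98 - 1.11), 1)  # Swiss evening:     ~3.5 lx
round(10^2.23)                            # Malaysian daytime: ~170 lx
round(10^(2.23 - 0.98), 1)                # Malaysian evening: ~17.8 lx
```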

5.2.2.3.1 Model diagnostics
Inf_plots1(inference_H2.2$model[[1]], inference_H2.2$data[[1]])

Inf_plots2(inference_H2.2$model[[1]], inference_H2.2$data[[1]])
`geom_smooth()` using formula = 'y ~ x'

Inf_plots3(inference_H2.2$model[[1]], inference_H2.2$data[[1]] %>% 
             drop_na(value), value, transformation = "identity"
           )

5.2.2.4 Pattern
5.2.2.4.1 Metric calculation
metrics <-
  rbind(
    metrics,
    data %>%
      map(\(x) {
        x %>%
          select(Id, Datetime, MEDI) %>%
          mutate(Time = hms::as_hms(Datetime) %>% as.numeric() / 3600,
                 Day = date(Datetime) %>% factor(),
                 Id_day = interaction(Id, Day),
                 metric = "MEDI")
      }) %>%
      list_rbind(names_to = "site") %>%
      mutate(site = factor(site)) %>%
      nest(data = -metric)
  )
5.2.2.4.2 Model fitting
formula_H2_pattern <- log10(MEDI) ~ site + s(Time, by = site, bs = "cc", k=12) + s(Id, by = site, bs = "re")

formula_H0_pattern <- log10(MEDI) ~ s(Time, bs = "cc", k = 12) + s(Id, by = site, bs = "re")

#setting the ends for the cyclic smooth
knots_day <- list(Time = c(0, 24))

#Model generation
Pattern_model0 <- 
  bam(formula_H0_pattern, 
      data = metrics %>% filter(metric == "MEDI") %>% pull(data) %>% .[[1]], 
      knots = knots_day,
      samfrac = 0.1,
      discrete = OpenMP_support,
      nthreads = 4,
      control = list(nthreads = 4))
Warning in bam(formula_H0_pattern, data = metrics %>% filter(metric == "MEDI")
%>% : openMP not available: single threaded computation only
Pattern_model <- 
  bam(formula_H2_pattern, 
      data = metrics %>% filter(metric == "MEDI") %>% pull(data) %>% .[[1]], 
      knots = knots_day,
      samfrac = 0.1,
      nthreads = 4,
      discrete = OpenMP_support,
      control = list(nthreads = 4))
Warning in bam(formula_H2_pattern, data = metrics %>% filter(metric == "MEDI")
%>% : openMP not available: single threaded computation only
#Model performance
AICs <- 
AIC(Pattern_model, Pattern_model0)
AICs
                     df     AIC
Pattern_model  59.96930 4516274
Pattern_model0 49.97899 4568575
Pattern_model_sum <- 
summary(Pattern_model)
Pattern_model_sum

Family: gaussian 
Link function: identity 

Formula:
log10(MEDI) ~ site + s(Time, by = site, bs = "cc", k = 12) + 
    s(Id, by = site, bs = "re")

Parametric coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)      0.45316    0.05400   8.391   <2e-16 ***
siteswitzerland  0.04480    0.09509   0.471    0.638    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                           edf Ref.df     F p-value    
s(Time):sitemalaysia     9.988     10 37307  <2e-16 ***
s(Time):siteswitzerland  9.993     10 79088  <2e-16 ***
s(Id):sitemalaysia      17.988     18  1623  <2e-16 ***
s(Id):siteswitzerland   18.994     19  2570  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.454   Deviance explained = 45.4%
fREML = 2.2583e+06  Scale est. = 1.2648    n = 1469740
# plot(Pattern_model)

plot_smooth(
  Pattern_model,
  view = "Time",
  plot_all = "site",
  rug = F,
  n.grid = 90,
  col = pal_jco()(2),
  rm.ranef = "s(Id)",
  sim.ci = TRUE
  )
Summary:
    * site : factor; set to the value(s): malaysia, switzerland. 
    * Time : numeric predictor; with 200 values ranging from 0.000000 to 23.983333. 
    * Id : factor; set to the value(s): MY012. 
    * NOTE : The following random effects columns are canceled: s(Time):sitemalaysia,s(Time):siteswitzerland,s(Id):sitemalaysia,s(Id):siteswitzerland
 
    * Simultaneous 95%-CI used : 
        Critical value: 2.364
        Proportion posterior simulations in pointwise CI: 0.87 (10000 samples)
        Proportion posterior simulations in simultaneous CI: 0.95 (10000 samples)
 

plot_diff(Pattern_model,
          view = "Time", 
          rm.ranef = "s(Id)",
          comp = list(site = c("malaysia", "switzerland")),
          sim.ci = TRUE)
Summary:
    * Time : numeric predictor; with 200 values ranging from 0.000000 to 23.983333. 
    * Id : factor; set to the value(s): MY012. 
    * NOTE : The following random effects columns are canceled: s(Time):sitemalaysia,s(Time):siteswitzerland,s(Id):sitemalaysia,s(Id):siteswitzerland
 
    * Simultaneous 95%-CI used : 
        Critical value: 2.09
        Proportion posterior simulations in pointwise CI: 1 (10000 samples)
        Proportion posterior simulations in simultaneous CI: 1 (10000 samples)
 


Time window(s) of significant difference(s):
    0.000000 - 4.097655
    7.351675 - 19.765159
    20.970352 - 23.983333
5.2.2.5 Model Diagnostics
gam.check(Pattern_model, rep = 100)


Method: fREML   Optimizer: perf chol
$grad
[1]  2.384107e-06  4.512179e-05  9.553069e-10  3.769318e-07 -4.802388e-05

$hess
           [,1]          [,2]          [,3]          [,4]          [,5]
   4.986919e+00  3.767262e-25  2.662657e-06  1.672765e-21     -4.994000
  -1.256410e-25  4.991019e+00  4.980802e-23  4.495121e-07     -4.996344
   2.662657e-06 -7.133444e-23  8.987577e+00 -3.167441e-19     -8.993804
  -7.640457e-23  4.495121e-07  3.483758e-21  9.488433e+00     -9.497046
d -4.994000e+00 -4.996344e+00 -8.993804e+00 -9.497046e+00 734869.000048

Model rank =  100 / 100 

Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.

                           k'   edf k-index p-value
s(Time):sitemalaysia    10.00  9.99    1.01    0.78
s(Time):siteswitzerland 10.00  9.99    1.01    0.85
s(Id):sitemalaysia      39.00 17.99      NA      NA
s(Id):siteswitzerland   39.00 18.99      NA      NA
formula_H2_pattern <- log10(MEDI) ~ 
  site + s(Time, by = site, bs = "cc", k=12) + 
  s(Time, by = Id, bs = "cc", k = 12) + 
  s(Id, by = site, bs = "re")

formula_H0_pattern <- log10(MEDI) ~ 
  s(Time, bs = "cc", k = 12) + 
  s(Time, by = Id, bs = "cc", k = 12) + 
  s(Id, by = site, bs = "re")

formula_Hm1_pattern <- log10(MEDI) ~ 
  s(Time, by = Id, bs = "cc", k = 12) + 
  s(Id, by = site, bs = "re")

#setting the ends for the cyclic smooth
knots_day <- list(Time = c(0, 24))

#Model generation
Pattern_model0 <- 
  bam(formula_H0_pattern, 
      data = metrics %>% filter(metric == "MEDI") %>% pull(data) %>% .[[1]], 
      knots = knots_day,
      samfrac = 0.1,
      discrete = OpenMP_support,
      nthreads = 4,
      control = list(nthreads = 4))
Warning in bam(formula_H0_pattern, data = metrics %>% filter(metric == "MEDI")
%>% : openMP not available: single threaded computation only
Pattern_modelm1 <- 
  bam(formula_Hm1_pattern, 
      data = metrics %>% filter(metric == "MEDI") %>% pull(data) %>% .[[1]], 
      knots = knots_day,
      samfrac = 0.1,
      discrete = OpenMP_support,
      nthreads = 4,
      control = list(nthreads = 4))
Warning in bam(formula_Hm1_pattern, data = metrics %>% filter(metric == :
openMP not available: single threaded computation only
Pattern_modelm2 <- 
  bam(formula_H2_pattern, 
      data = metrics %>% filter(metric == "MEDI") %>% pull(data) %>% .[[1]], 
      knots = knots_day,
      samfrac = 0.1,
      nthreads = 4,
      discrete = OpenMP_support,
      control = list(nthreads = 4))
Warning in bam(formula_H2_pattern, data = metrics %>% filter(metric == "MEDI")
%>% : openMP not available: single threaded computation only
#Model performance
AICs <- 
AIC(Pattern_modelm2, Pattern_model0, Pattern_modelm1)
AICs
                      df     AIC
Pattern_modelm2 427.2126 4334753
Pattern_model0  427.3398 4334745
Pattern_modelm1 428.0966 4334743
Pattern_model_sum <- 
summary(Pattern_modelm2)
Pattern_model_sum

Family: gaussian 
Link function: identity 

Formula:
log10(MEDI) ~ site + s(Time, by = site, bs = "cc", k = 12) + 
    s(Time, by = Id, bs = "cc", k = 12) + s(Id, by = site, bs = "re")

Parametric coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)      0.45343    0.05409   8.383   <2e-16 ***
siteswitzerland  0.04433    0.09500   0.467    0.641    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                              edf Ref.df        F p-value    
s(Time):sitemalaysia    9.154e+00     10   10.872  <2e-16 ***
s(Time):siteswitzerland 9.966e+00     10 5989.456  <2e-16 ***
s(Time):IdMY001         9.669e+00     10   26.664  <2e-16 ***
s(Time):IdMY002         9.790e+00     10   31.793  <2e-16 ***
s(Time):IdMY003         7.647e+00     10    3.150  <2e-16 ***
s(Time):IdMY004         9.581e+00     10   22.191  <2e-16 ***
s(Time):IdMY005         9.432e+00     10   14.535  <2e-16 ***
s(Time):IdMY007         9.632e+00     10   25.656  <2e-16 ***
s(Time):IdMY008         9.765e+00     10   32.729  <2e-16 ***
s(Time):IdMY009         8.340e+00     10    4.870  <2e-16 ***
s(Time):IdMY010         9.825e+00     10   41.545  <2e-16 ***
s(Time):IdMY011         9.688e+00     10   29.210  <2e-16 ***
s(Time):IdMY012         9.680e+00     10   27.202  <2e-16 ***
s(Time):IdMY013         8.956e+00     10    8.589  <2e-16 ***
s(Time):IdMY014         9.014e+00     10    9.024  <2e-16 ***
s(Time):IdMY015         9.737e+00     10   30.400  <2e-16 ***
s(Time):IdMY016         9.587e+00     10   22.074  <2e-16 ***
s(Time):IdMY017         9.579e+00     10   18.883  <2e-16 ***
s(Time):IdMY018         9.680e+00     10   20.209  <2e-16 ***
s(Time):IdMY019         9.828e+00     10   46.858  <2e-16 ***
s(Time):IdMY020         9.270e+00     10    9.067  <2e-16 ***
s(Time):IdID01          9.885e+00     10  271.241  <2e-16 ***
s(Time):IdID02          9.887e+00     10   92.928  <2e-16 ***
s(Time):IdID03          9.974e+00     10  430.152  <2e-16 ***
s(Time):IdID04          9.976e+00     10 1263.553  <2e-16 ***
s(Time):IdID05          9.939e+00     10  195.099  <2e-16 ***
s(Time):IdID06          9.748e+00     10  144.720  <2e-16 ***
s(Time):IdID07          9.908e+00     10  145.609  <2e-16 ***
s(Time):IdID08          9.909e+00     10  230.906  <2e-16 ***
s(Time):IdID09          9.683e+00     10 1242.976  <2e-16 ***
s(Time):IdID10          9.801e+00     10  332.364  <2e-16 ***
s(Time):IdID11          9.955e+00     10  265.852  <2e-16 ***
s(Time):IdID12          9.873e+00     10   95.177  <2e-16 ***
s(Time):IdID13          9.965e+00     10  334.103  <2e-16 ***
s(Time):IdID14          9.919e+00     10  434.586  <2e-16 ***
s(Time):IdID15          9.853e+00     10  397.519  <2e-16 ***
s(Time):IdID16          1.555e-04     10    0.000  <2e-16 ***
s(Time):IdID17          9.935e+00     10  635.199  <2e-16 ***
s(Time):IdID18          9.819e+00     10  239.344  <2e-16 ***
s(Time):IdID19          9.897e+00     10  265.421  <2e-16 ***
s(Time):IdID20          9.954e+00     10  789.975  <2e-16 ***
s(Id):sitemalaysia      1.799e+01     18 1837.369  <2e-16 ***
s(Id):siteswitzerland   1.899e+01     19 2895.488  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.518   Deviance explained = 51.8%
fREML = 2.1686e+06  Scale est. = 1.1176    n = 1469740
#Model overview
# plot(Pattern_model, shade = TRUE, residuals = TRUE, cex = 1, all.terms = TRUE)

colors_pattern <- pal_jco()(2)

plot_smooth(
  Pattern_modelm1,
  view = "Time",
  plot_all = "Id",
  rm.ranef = FALSE,
  se = 0,
  rug = F,
  n.grid = 90,
  col = c(rep(colors_pattern[1], 19),rep(colors_pattern[2], 20)),
  # sim.ci = TRUE
  )
Summary:
    * Time : numeric predictor; with 90 values ranging from 0.000000 to 23.983333. 
    * Id : factor with 39 values; set to the value(s): ID01, ID02, ID03, ID04, ID05, ID06, ID07, ID08, ID09, ID10, ... 
    * site : factor; set to the value(s): switzerland. 

5.2.2.6 Model Diagnostics
gam.check(Pattern_model)


Method: fREML   Optimizer: perf chol
$grad
[1]  2.384107e-06  4.512179e-05  9.553069e-10  3.769318e-07 -4.802388e-05

$hess
           [,1]          [,2]          [,3]          [,4]          [,5]
   4.986919e+00  3.767262e-25  2.662657e-06  1.672765e-21     -4.994000
  -1.256410e-25  4.991019e+00  4.980802e-23  4.495121e-07     -4.996344
   2.662657e-06 -7.133444e-23  8.987577e+00 -3.167441e-19     -8.993804
  -7.640457e-23  4.495121e-07  3.483758e-21  9.488433e+00     -9.497046
d -4.994000e+00 -4.996344e+00 -8.993804e+00 -9.497046e+00 734869.000048

Model rank =  100 / 100 

Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.

                           k'   edf k-index p-value
s(Time):sitemalaysia    10.00  9.99    1.02    0.95
s(Time):siteswitzerland 10.00  9.99    1.02    0.94
s(Id):sitemalaysia      39.00 17.99      NA      NA
s(Id):siteswitzerland   39.00 18.99      NA      NA

Using the formula specified in the preregistration document reveals no significant difference between the sites (the more complex model including a site effect yields no AIC improvement). Furthermore, a reduced model including only individual smooths per participant (Pattern_modelm1) reveals no common pattern over time, i.e., the patterns vary so strongly between participants that the model struggles when trying to extract common values at a given time point.

To nonetheless illustrate the overall trend between the sites, a simplified model is used that implements only random intercepts for the participants (compared to random smooths in the preregistered variant). This allows for an overall comparison between the sites, even though it cannot be mapped 1:1 onto individual participants. Here, there is strong evidence for an effect of site, and the difference is significant across most of the day. The difference plot shows that Malaysian participants have lower MEDI values during the day compared to Swiss participants, and higher ones at night.
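The AIC ranking can be made explicit by computing ΔAIC values relative to the best (lowest-AIC) model, using the numbers printed in the model-performance table above:

```r
# Delta-AIC relative to the best-fitting model (values from the AIC table above):
# the preregistered site model (m2) is worse, not better, than the simpler models
aic <- c(Pattern_modelm2 = 4334753, Pattern_model0 = 4334745,
         Pattern_modelm1 = 4334743)
sort(aic - min(aic))  # Pattern_modelm1: 0, Pattern_model0: 2, Pattern_modelm2: 10
```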

5.3 Research Question 2

RQ 2: Are there differences in self-reported light exposure patterns using LEBA across time or between the two sites, and if so, in which questions/scores?

As RQ 2 relates to LEBA questions and factors, the factor scores have to be calculated first.

5.3.0.1 Metric calculation
p_adjustment <- append(
  p_adjustment,
  list(
  H3 = 23 + 5 # 23 LEBA items + 5 factor scores
))


#Factor calculation
factors_leba <- list(
  F1 = names(leba_questions)[c(1,2,3)],
  F2 = names(leba_questions)[c(4:9)],
  F3 = names(leba_questions)[c(10:14)],
  F4 = names(leba_questions)[c(15:18)],
  F5 = names(leba_questions)[c(19:23)]
)
Fs <- paste0("F", 1:5)

leba_data_combined <-
leba_data_combined %>% 
  mutate(leba_F2_04 = fct_rev(leba_F2_04), 
         leba_F1 = rowSums(across(contains(factors_leba$F1), as.numeric)),
         leba_F2 = rowSums(across(contains(factors_leba$F2), as.numeric)),
         leba_F3 = rowSums(across(contains(factors_leba$F3), as.numeric)),
         leba_F4 = rowSums(across(contains(factors_leba$F4), as.numeric)),
         leba_F5 = rowSums(across(contains(factors_leba$F5), as.numeric)),
         ) %>% 
  mutate(leba_F2_04 = fct_rev(leba_F2_04),
         across(c(site, Id), factor))

leba_factors <- c(
  leba_F1 = "Wearing blue light filtering glasses indoors and outdoors",
  leba_F2 = "Spending time outdoors",
  leba_F3 = "Using phones and smartwatches in bed before sleep",
  leba_F4 = "Controlling and using ambient light before bedtime",
  leba_F5 = "Using light in the morning and during daytime"
)

#set variable labels
var_label(leba_data_combined) <- leba_factors %>% as.list()

leba_metrics <- 
  leba_data_combined %>% 
  mutate(across(contains("leba_F"), as.numeric)) %>%
  pivot_longer(cols = starts_with("leba_F"), names_to = "metric", values_to = "value") %>% 
  nest(data = -metric)

leba_metrics <- 
leba_metrics %>% 
  rowwise() %>% 
  mutate(data = list({
    if(str_length(metric) == 7) data
    else data %>% mutate(value = factor(value, levels = 1:5, labels = leba_levels))
  }
  ))

5.3.1 Hypothesis 3

\(H3\): There are differences in LEBA items and factors between Malaysia and Switzerland.

5.3.1.1 Model fitting
5.3.1.1.1 Whole Dataset
formula_H3 <- value ~ site + (1|site:Id)
formula_H0 <- value ~ 1 + (1|site:Id)

clmm <- ordinal::clmm

inference_H3 <- 
leba_metrics %>% 
  filter(str_length(metric) != 7) %>% 
  mutate(formula_H1 = c(formula_H3),
         formula_H0 = c(formula_H0),
         type = "clmm")

inference_H3 <-
  inference_H3 %>% 
    rowwise() %>% 
    mutate(
      model = 
        list(
            do.call(type, list(formula = formula_H1,
                                          data = data,
                               nAGQ = 5))
            ),
      model0 = list(
        do.call(type, list(formula = formula_H0,
                           data = data,
                           nAGQ = 5))
      ))
Warning: There were 2 warnings in `mutate()`.
The first warning was:
ℹ In argument: `model = list(do.call(type, list(formula = formula_H1, data =
  data, nAGQ = 5)))`.
ℹ In row 16.
Caused by warning in `update.uC()`:
! Non finite negative log-likelihood
  at iteration 81
ℹ Run `dplyr::last_dplyr_warnings()` to see the 1 remaining warning.
inference_H3$comparison <- vector("list", nrow(inference_H3))

for(i in 1:nrow(inference_H3)) {
  comparison <- anova(inference_H3$model[[i]], inference_H3$model0[[i]])
  inference_H3$comparison[[i]] <- comparison
}

inference_H3 <- 
inference_H3 %>%
  mutate(
      p.value_adj = comparison$`Pr(>Chisq)` %>% .[2] %>%
        p.adjust(method = "fdr", n = p_adjustment$H3),
      significant = p.value_adj <= 0.05,
      summary = list(
        switch(significant+1,
               tidy(model0),
               tidy(model))),
    table = list(
        tibble(
          metric = metric,
          p.value =
            p.value_adj %>%
            style_pvalue() %>%
            {ifelse(significant, paste0("**", ., "**"),.)},
          summary %>%
            mutate(estimate = estimate) %>%
            select(term, estimate) %>%
            filter(term != "sd__(Intercept)" &
                     term != "sd__Observation") %>%
            mutate(term = str_remove_all(term, "\\(|\\)|site")) %>%
            pivot_wider(names_from = term, values_from = estimate)
        ) %>% 
          mutate(malaysia = if(exists("switzerland")) 0 else NA)
  )
  )

inference_H3 %>% pull(table) %>% list_rbind() %>% select(metric, p.value) %>% gt()
metric p.value
leba_F1_01 >0.9
leba_F1_02 >0.9
leba_F1_03 >0.9
leba_F2_04 >0.9
leba_F2_05 >0.9
leba_F2_06 >0.9
leba_F2_07 >0.9
leba_F2_08 >0.9
leba_F2_09 0.8
leba_F3_10 >0.9
leba_F3_11 >0.9
leba_F3_12 0.7
leba_F3_13 >0.9
leba_F3_14 >0.9
leba_F4_15 >0.9
leba_F4_16 >0.9
leba_F4_17 0.5
leba_F4_18 >0.9
leba_F5_19 >0.9
leba_F5_20 0.9
leba_F5_21 >0.9
leba_F5_22 0.3
leba_F5_23 >0.9

Under the strict p-value adjustment for H3 (n=28), none of the behavioral questions are significantly different between the sites.
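A hypothetical example illustrates why the adjusted p-values cluster near 1: with a single test adjusted against the preregistered family size of n = 28, the Benjamini-Hochberg (FDR) adjustment reduces to min(1, p × 28). The raw p-values below are made-up illustration values, not study results:

```r
# FDR (Benjamini-Hochberg) adjustment of a single p-value against a fixed
# family size of n = 28, as done for H3 above
p.adjust(0.04, method = "fdr", n = 28)   # min(1, 0.04 * 28) -> capped at 1
p.adjust(0.001, method = "fdr", n = 28)  # min(1, 0.001 * 28) = 0.028
```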

5.3.1.1.2 Only last collection point
formula_H3 <- value ~ site
formula_H0 <- value ~ 1

clmm <- ordinal::clmm
clm <- ordinal::clm

inference_H3.2 <- 
leba_metrics %>% 
  filter(str_length(metric) != 7) %>% 
  mutate(formula_H1 = c(formula_H3),
         formula_H0 = c(formula_H0),
         type = "clm",
         data = list(data %>% filter(Day == 31)))

inference_H3.2 <-
  inference_H3.2 %>% 
    rowwise() %>% 
    mutate(
      model = 
        list(
            do.call(type, list(formula = formula_H1,
                                          data = data))
            ),
      model0 = list(
        do.call(type, list(formula = formula_H0,
                           data = data))
      ))

inference_H3.2$comparison <- vector("list", nrow(inference_H3.2))

for(i in 1:nrow(inference_H3.2)) {
  comparison <- anova(inference_H3.2$model[[i]], inference_H3.2$model0[[i]])
  inference_H3.2$comparison[[i]] <- comparison
}

inference_H3.2 <- 
inference_H3.2 %>%
  mutate(
      p.value_adj = comparison$`Pr(>Chisq)` %>% .[2] %>%
        p.adjust(method = "fdr", n = p_adjustment$H3),
      significant = p.value_adj <= 0.05,
      summary = list(
        switch(significant+1,
               tidy(model0),
               tidy(model))),
    table = list(
        tibble(
          metric = metric,
          p.value =
            p.value_adj %>%
            style_pvalue() %>%
            {ifelse(significant, paste0("**", ., "**"),.)},
          summary %>%
            mutate(estimate = estimate) %>%
            select(term, estimate) %>%
            filter(term != "sd__(Intercept)" &
                     term != "sd__Observation") %>%
            mutate(term = str_remove_all(term, "\\(|\\)|site")) %>%
            pivot_wider(names_from = term, values_from = estimate)
        ) %>% 
          mutate(malaysia = if(exists("switzerland")) 0 else NA)
  )
  )

inference_H3.2 %>% pull(table) %>% list_rbind() %>% select(metric, p.value) %>% gt()
metric p.value
leba_F1_01 >0.9
leba_F1_02 >0.9
leba_F1_03 >0.9
leba_F2_04 >0.9
leba_F2_05 >0.9
leba_F2_06 >0.9
leba_F2_07 >0.9
leba_F2_08 >0.9
leba_F2_09 >0.9
leba_F3_10 >0.9
leba_F3_11 >0.9
leba_F3_12 0.055
leba_F3_13 >0.9
leba_F3_14 >0.9
leba_F4_15 >0.9
leba_F4_16 >0.9
leba_F4_17 >0.9
leba_F4_18 >0.9
leba_F5_19 >0.9
leba_F5_20 >0.9
leba_F5_21 >0.9
leba_F5_22 >0.9
leba_F5_23 >0.9
formula_H3 <- value ~ site + (1|site:Id)
formula_H0 <- value ~ 1 + (1|site:Id)

inference_H3.3 <-
  leba_metrics %>%
  filter(str_length(metric) == 7) %>%
  mutate(formula_H1 = c(formula_H3),
         formula_H0 = c(formula_H0),
         type = "lmer",
         family = list(poisson())
         )

inference_H3.3 <-
  inference_H3.3 %>%
  inference_summary2(p_adjustment = p_adjustment$H3)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
inference_H3.3 %>% pull(table) %>% list_rbind() %>% select(metric, p.value) %>% gt()
metric p.value
leba_F1 >0.9
leba_F2 >0.9
leba_F3 >0.9
leba_F4 >0.9
leba_F5 >0.9

Under the strict p-value adjustment for H3 (n=28), none of the behavioral factors are significantly different between sites.

5.3.1.1.3 Model diagnostics
Inf_plots1(x, z)

Inf_plots2(x, z)

Inf_plots3(x, z, value, "identity")

Inf_plots1(x, z)

Inf_plots2(x, z)

Inf_plots3(x, z, value, "identity")

Inf_plots1(x, z)

Inf_plots2(x, z)

Inf_plots3(x, z, value, "identity")

Inf_plots1(x, z)

Inf_plots2(x, z)

Inf_plots3(x, z, value, "identity")

Inf_plots1(x, z)

Inf_plots2(x, z)

Inf_plots3(x, z, value, "identity")

5.3.2 Hypothesis 4

\(H4\): LEBA scores vary over time within participants

For this hypothesis, only the Malaysian data is used, as specified in the preregistration document.

bootstraps <- 10^4

leba_metrics_H4 <- 
leba_metrics %>% 
  mutate(data = list(data %>% filter(site == "malaysia")))

boot_plot <- function(data) {
  data %>% 
  ggplot(aes(x = CI95lower/(1/3))) + 
  geom_histogram(aes(fill = CI95lower/(1/3) >= 0.5), binwidth = 1, col = "white", alpha = 0.75) +
  scale_x_continuous(breaks = c(0:10)) + 
  theme_void() + 
  labs(x = "lower CI: Number of varying answers", y = "Frequency")+
  theme(axis.text = element_text(),
        legend.position = "none",
        axis.title.x = element_text(),
        panel.grid.major.y = element_line(colour = "grey80")) + 
  scale_y_continuous(labels = scales::label_percent(scale = 100/19),
                     breaks = c(0, 50/100*19, 25/100*19, 75/100*19, 100/100*19))+
  coord_cartesian(ylim = c(0, 100/100*19))
}

leba_metrics_H4 <- 
  leba_metrics_H4 %>%
  rowwise() %>% 
  mutate(bootstrap =
           list(data %>%
               select(Id, value) %>%
               mutate(value = as.numeric(value)) %>%
               expand_grid(bootstrap = seq(bootstraps)) %>%
               group_by(Id, bootstrap) %>%
               slice_sample(prop = 1, replace = TRUE) %>%
               summarize(sd = sd(value), .groups = "drop_last") %>%
               summarize(
                 mean_sd = mean(sd),
                 CI95lower = quantile(sd, 0.025),
                 CI95_upper = quantile(sd, 0.975),
                 .groups = "drop_last"
               ) %>%
               mutate(significant = ifelse(CI95lower == 0, FALSE, TRUE)) %>%
               summarize(
                 percent_significant = mean(significant),
                 plot = list(tibble(CI95lower) %>% boot_plot())
               )
           )
         )
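Distilled to a single participant, the rowwise bootstrap pipeline above amounts to the following sketch (using a made-up answer vector, not study data):

```r
# Bootstrap the standard deviation of one participant's repeated answers:
# if the lower bound of the 95% CI stays above 0, the answers vary more than
# would be expected if resampling could produce constant answer sets.
set.seed(1)
answers  <- c(2, 3, 2, 3, 2, 3, 2, 3, 2)  # hypothetical answers at 9 time points
boot_sd  <- replicate(10^4, sd(sample(answers, replace = TRUE)))
ci_lower <- quantile(boot_sd, 0.025)
unname(ci_lower > 0)  # TRUE: this participant shows significant variation
```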

labels_leba <- 
var_label(leba_data_combined)[-c(1:3)] %>% enframe() %>% unnest(value)

leba_results_H4 <-
  leba_metrics_H4 %>%
  unnest(bootstrap) %>%
  select(-data) %>%
  left_join(labels_leba, by = c("metric" = "name")) %>%
  rename(question = value) %>% 
  group_by(quartile =
             cut(
               percent_significant,
               breaks = c(-0.1, 0.25, 0.5, 0.75, 1.1),
               labels =
                 paste(
                   "stable over time in",
                   c("75-100%", "50-75%", "25-50%", "0-25%"),
                   "of participants"
                 )
             ))

leba_results_H4 %>%
  relocate(question, .before = 1) %>%
  arrange(percent_significant) %>%
  gt() %>%
  text_transform(
    locations = cells_body(columns = plot),
    fn = function(x)
      leba_results_H4$plot %>% ggplot_image(height = px(200))
  ) %>%
  fmt_percent(columns = percent_significant) %>%
  cols_label(plot = "",
             percent_significant = "Participants with significant variation",
             question = "Item/Factor",
             metric = "Metric code") %>%
  cols_align(align = "center", columns = -question) %>%
  tab_footnote(
    paste(
      "based on",
      bootstraps,
      "bootstraps of individual participants' answers across the four weeks of data collection. If the lower bound of the 95% confidence interval is above 0, the participant is considered to have significant variation in their answers."
    ),
    locations = cells_column_labels(columns = "percent_significant")
  ) %>%
  tab_header(title = "Model Results for Hypothesis 4") %>%
  tab_style(
    style = cell_text(weight = "bold"),
    locations = list(cells_column_labels(), cells_stub())
  )
Model Results for Hypothesis 4
Item/Factor Metric code Participants with significant variation1
stable over time in 75-100% of participants
I look at my smartwatch when I wake up at night leba_F3_14 0.00%
I wear blue-filtering, orange-tinted, and/or red-tinted glasses indoors during the day leba_F1_01 5.26%
I wear blue-filtering, orange-tinted, and/or red-tinted glasses within 1 hour before attempting to fall asleep leba_F1_03 5.26%
I go for a walk or exercise outside within 2 hours after waking up leba_F2_09 5.26%
I look at my smartwatch within 1 hour before attempting to fall asleep leba_F3_13 5.26%
I wear blue-filtering, orange-tinted, and/or red-tinted glasses outdoors during the day leba_F1_02 10.53%
I use my mobile phone within 1 hour before attempting to fall asleep leba_F3_10 10.53%
Wearing blue light filtering glasses indoors and outdoors leba_F1 10.53%
I use a desk lamp when I do focused work leba_F5_21 15.79%
I use an alarm with a dawn simulation light leba_F5_22 15.79%
I use tunable lights to create a healthy light environment leba_F5_19 21.05%
stable over time in 50-75% of participants
I look at my mobile phone screen immediately after waking up leba_F3_11 26.32%
I use a blue-filter app on my computer screen within 1 hour before attempting to fall asleep leba_F4_16 26.32%
I spend between 1 and 3 hours per day (in total) outside leba_F2_06 31.58%
I spend more than 3 hours per day (in total) outside leba_F2_07 31.58%
I spend as much time outside as possible leba_F2_08 31.58%
I use LEDs to create a healthy light environment leba_F5_20 31.58%
I spend between 30 minutes and 1 hour per day (in total) outside leba_F2_05 36.84%
I use as little light as possible when I get up during the night leba_F4_17 36.84%
I turn on the lights immediately after waking up leba_F5_23 36.84%
I check my phone when I wake up at night leba_F3_12 42.11%
I dim my mobile phone screen within 1 hour before attempting to fall asleep leba_F4_15 42.11%
I spend 30 minutes or less per day (in total) outside leba_F2_04 47.37%
I dim my computer screen within 1 hour before attempting to fall asleep leba_F4_18 47.37%
stable over time in 25-50% of participants
Using light in the morning and during daytime leba_F5 73.68%
stable over time in 0-25% of participants
Using phones and smartwatches in bed before sleep leba_F3 78.95%
Spending time outdoors leba_F2 84.21%
Controlling and using ambient light before bedtime leba_F4 94.74%
1 based on 10000 bootstraps of individual participants' answers across the four weeks of data collection. If the lower bound of the 95% confidence interval is above 0, the participant is considered to have significant variation in their answers.
leba_H4_table <- 
leba_results_H4 %>% select(metric, quartile) %>% nest(data = metric) %>%
  arrange(quartile) %>% 
  mutate(data = map(data, ~.x %>% pull(metric) %>% paste(collapse = ", "))) %>% 
  unnest(data) %>% 
  mutate(data = str_replace_all(data, "leba_F", "Factor "),
         data = str_replace_all(data, "_", ": Item ")) %>%
  gt() %>% 
  tab_style(
    style = cell_text(weight = "bold"),
    locations = cells_row_groups()
  ) %>% 
  tab_options(column_labels.hidden = TRUE) %>% 
  tab_header(title = "Hypothesis 4: Extent of stable LEBA items and factors over time")
  
leba_H4_table
Hypothesis 4: Extent of stable LEBA items and factors over time
stable over time in 75-100% of participants
Factor 1: Item 01, Factor 1: Item 02, Factor 1: Item 03, Factor 2: Item 09, Factor 3: Item 10, Factor 3: Item 13, Factor 3: Item 14, Factor 5: Item 19, Factor 5: Item 21, Factor 5: Item 22, Factor 1
stable over time in 50-75% of participants
Factor 2: Item 04, Factor 2: Item 05, Factor 2: Item 06, Factor 2: Item 07, Factor 2: Item 08, Factor 3: Item 11, Factor 3: Item 12, Factor 4: Item 15, Factor 4: Item 16, Factor 4: Item 17, Factor 4: Item 18, Factor 5: Item 20, Factor 5: Item 23
stable over time in 25-50% of participants
Factor 5
stable over time in 0-25% of participants
Factor 2, Factor 3, Factor 4
leba_H4_table %>% gtsave("figures/Table_H4.png")
file:////var/folders/9p/326_k3kx43qbn_cyl1rqfhb00000gn/T//RtmpMjIETH/file11b266860eab0.html screenshot completed
v1 <- leba_results_H4 %>% summarize(length = n())

As the two summary tables show, the majority of items are very stable over time in our sample: 24 items and factors show no significant variance in over half of the participants. The factor scores are less stable over time, and with the exception of Factor 1, all show variance in roughly 75% or more of participants. This is to be expected, as a factor score, being the sum of several items, is more likely to vary than any individual item. The next section summarizes the average changes in and standard deviation of the LEBA metrics across the four weeks of data collection.
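The "average change" reported below divides the mean standard deviation by 1/3. This divisor follows from the fact that a single 1-point change across the nine collection points produces a standard deviation of exactly 1/3, which can be verified directly:

```r
# One answer changing by a single point across 9 collection points gives SD = 1/3,
# so dividing mean_sd by 1/3 expresses variability in units of 1-point changes.
answers <- c(rep(2, 8), 3)   # eight stable answers, one changed by 1 point
sd(answers)                  # SD of exactly 1/3
sd(answers) / (1/3)          # one "average change" unit
```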

#average change
leba_changes <- 
leba_metrics %>% 
  mutate(data = list(data %>% filter(site == "malaysia")),
         data = list(data %>% mutate(value = as.numeric(value)) %>% 
                       group_by(Id) %>% 
                       summarize(sd = sd(value), mean = mean(value),
                                 cv = sd/mean, .groups = "drop_last") %>% 
                       summarize(mean_sd = mean(sd),
                                 mean_cv = mean(cv))))
         
leba_changes %>% unnest(data) %>% select(metric, mean_sd) %>% 
  mutate(average_change = mean_sd/(1/3),
         average_change = average_change %>% round()) %>%
  left_join(labels_leba, by = c("metric" = "name")) %>%
  relocate(question = value, .before = 1) %>% 
  gt() %>% 
  fmt_number(columns = mean_sd, decimals = 2) %>% 
    tab_style(
    style = cell_text(weight = "bold"),
    locations = cells_column_labels()
  ) %>% 
  cols_label(
             question = "Item/Factor",
             metric = "Metric code",
             mean_sd = "SD",
             average_change = "Average change") %>%
  tab_footnote(
    "Average standard deviation across participants",
    locations = cells_column_labels(columns = "mean_sd")
  ) %>% 
  tab_footnote(
        "Total average change in scores across the four weeks of data collection. A value of 1 indicates that across the 9 times of data collection, a participant's score changed only once, by 1 point. A value of 2 indicates the score changed once by 2 points, or twice by 1 point, and so forth.",
    locations = cells_column_labels(columns = "average_change")
  ) %>% 
  tab_header(title = "Average change in LEBA metrics across four weeks of data collection")
Average change in LEBA metrics across four weeks of data collection
Item/Factor Metric code SD1 Average change2
I wear blue-filtering, orange-tinted, and/or red-tinted glasses indoors during the day leba_F1_01 0.10 0
I wear blue-filtering, orange-tinted, and/or red-tinted glasses outdoors during the day leba_F1_02 0.18 1
I wear blue-filtering, orange-tinted, and/or red-tinted glasses within 1 hour before attempting to fall asleep leba_F1_03 0.09 0
I spend 30 minutes or less per day (in total) outside leba_F2_04 0.57 2
I spend between 30 minutes and 1 hour per day (in total) outside leba_F2_05 0.54 2
I spend between 1 and 3 hours per day (in total) outside leba_F2_06 0.46 1
I spend more than 3 hours per day (in total) outside leba_F2_07 0.51 2
I spend as much time outside as possible leba_F2_08 0.39 1
I go for a walk or exercise outside within 2 hours after waking up leba_F2_09 0.24 1
I use my mobile phone within 1 hour before attempting to fall asleep leba_F3_10 0.27 1
I look at my mobile phone screen immediately after waking up leba_F3_11 0.46 1
I check my phone when I wake up at night leba_F3_12 0.57 2
I look at my smartwatch within 1 hour before attempting to fall asleep leba_F3_13 0.08 0
I look at my smartwatch when I wake up at night leba_F3_14 0.07 0
I dim my mobile phone screen within 1 hour before attempting to fall asleep leba_F4_15 0.62 2
I use a blue-filter app on my computer screen within 1 hour before attempting to fall asleep leba_F4_16 0.27 1
I use as little light as possible when I get up during the night leba_F4_17 0.70 2
I dim my computer screen within 1 hour before attempting to fall asleep leba_F4_18 0.55 2
I use tunable lights to create a healthy light environment leba_F5_19 0.28 1
I use LEDs to create a healthy light environment leba_F5_20 0.49 1
I use a desk lamp when I do focused work leba_F5_21 0.25 1
I use an alarm with a dawn simulation light leba_F5_22 0.19 1
I turn on the lights immediately after waking up leba_F5_23 0.54 2
Wearing blue light filtering glasses indoors and outdoors leba_F1 0.32 1
Spending time outdoors leba_F2 1.40 4
Using phones and smartwatches in bed before sleep leba_F3 1.03 3
Controlling and using ambient light before bedtime leba_F4 1.33 4
Using light in the morning and during daytime leba_F5 1.28 4
1 Average standard deviation across participants
2 Total average change in scores across the four weeks of data collection. A value of 1 indicates that across the 9 collection timepoints, a participant's score changed only once, by 1 point. A value of 2 indicates one change by 2 points, or two changes by 1 point, and so forth.

5.4 Research Question 3

RQ 3: In general, how are light exposure and LEBA related, and are there differences in this relationship between the two sites?

5.4.0.1 Metric calculation

Additional metrics need to be calculated:

  • First time above threshold 250 lx mel EDI (FLiT250)
  • First time above threshold 1000 lx mel EDI (FLiT1000)
  • L5 without sleep period (night), i.e. day and evening (L5mde)
  • L5 only night time (L5mn)
  • Melanopic/photopic ratio (MPratio)
  • Light exposure (dose of light) for melanopic EDI (LE)
  • Spectral contribution to melanopic EDI (SCmelEDI)
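The two FLiT metrics are computed below with LightLogR's timing_above_threshold(); the underlying idea, the first timestamp of the day at which mel EDI exceeds a threshold, can be sketched as follows (Python, with illustrative names and made-up minute-binned values):

```python
from datetime import datetime, timedelta

def first_time_above(medi, datetimes, threshold):
    """First timestamp at which mel EDI exceeds `threshold`,
    or None if the threshold is never crossed that day."""
    for value, dt in zip(medi, datetimes):
        if value > threshold:
            return dt
    return None

start = datetime(2024, 10, 18, 6, 0)
times = [start + timedelta(minutes=10 * i) for i in range(6)]
medi = [50, 120, 180, 260, 1200, 900]  # mel EDI in lx, 10-min bins
print(first_time_above(medi, times, 250).time())   # -> 06:30:00
print(first_time_above(medi, times, 1000).time())  # -> 06:40:00
```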
p_adjustment <- append(
  p_adjustment,
  list(
  H5 = 2*84,
  H6 = 23+5
))

metrics <-
  rbind(
    metrics,
    data %>%
map(
  \(x) {
      #whole-day metrics
      whole_day <-
      x %>%
        group_by(Id, Day) %>%
        summarize(
          .groups = "drop",
          FLiT1000 =
            timing_above_threshold(
              MEDI, Datetime, "above", 1000, as.df = TRUE) %>%
            pull(first_timing_above_1000) %>% hms::as_hms() %>% as.numeric(),
          FLiT250 =
            timing_above_threshold(MEDI, Datetime, "above", 250, as.df = TRUE) %>%
            pull(first_timing_above_250) %>% hms::as_hms() %>% as.numeric(),
          MPratio = mean(MEDI, na.rm = TRUE)/mean(Photopic.lux, na.rm = TRUE),
          LE = sum(MEDI, na.rm = TRUE)/60,
          SCmelEDI = mean(MEDI, na.rm = TRUE)/753.35/
            mean(rowSums(across(starts_with("WL_")))*1000, na.rm = TRUE)
        ) %>%
        mutate(across(where(is.duration), as.numeric)) %>% 
        pivot_longer(cols = -c(Id, Day), names_to = "metric")
      
      #part-day metrics
      part_day <-
        x %>%
        group_by(Id, Day, Night = Photoperiod == "night") %>% 
        summarize(
          .groups = "drop",
          L5m =
            bright_dark_period(
                  MEDI, Datetime, "darkest", "5 hours", as.df = TRUE, 
                  na.rm = TRUE
                  ) %>% pull(darkest_5h_mean) %>% log10()
        ) %>% 
        pivot_longer(cols = -c(Id, Day, Night), names_to = "metric") %>% 
        mutate(metric = case_when(
          metric == "L5m" & Night == TRUE ~ "L5mn",
          metric == "L5m" & Night == FALSE ~ "L5mde"
        ))
      
      whole_day %>% 
        bind_rows(part_day)
    }
  ) %>% 
  list_rbind(names_to = "site") %>% 
  nest(data = -metric)
  )

#nest the datasets by site
metricsRQ3 <-
  metrics %>%
  rowwise() %>% 
  mutate(data = list(data %>% 
                       mutate(across(any_of("Day"), as_date))
                     )
         ) %>% 
  unnest(data) %>%
  nest(data = -c(site, metric)) %>%
  nest(data = -site)

#select relevant metrics by question
metric_selection$H5 <-
  list(
    leba_F1_01 = c(),
    leba_F1_02 = c(),
    leba_F1_03 = c(),
    leba_F2_04 = c("M10m", "TAT250", "TAT1000", "PAT1000", "IV", "IS"),
    leba_F2_05 = c("M10m", "TAT250", "TAT1000", "PAT1000", "IV", "IS"),
    leba_F2_06 = c("M10m", "TAT250", "TAT1000", "PAT1000", "IV", "IS"),
    leba_F2_07 = c("M10m", "TAT250", "TAT1000", "PAT1000", "IV", "IS"),
    leba_F2_08 = c("M10m", "TAT250", "TAT1000", "PAT1000", "IV", "IS"),
    leba_F2_09 = c("FLiT1000", "IV", "IS"),
    leba_F3_10 = c("LLiT10", "LLiT250", "L5mn", "IV", "IS"),
    leba_F3_11 = c("FLiT250", "IV", "IS"),
    leba_F3_12 = c("L5m", "IV", "IS"),
    leba_F3_13 = c("LLiT10", "LLiT250", "L5mn", "IV", "IS"),
    leba_F3_14 = c("L5m", "IV", "IS"),
    leba_F4_15 = c("LLiT10", "LLiT250", "L5mde", "IV", "IS"),
    leba_F4_16 = c("LLiT10", "LLiT250", "L5mde", "IV", "IS"),
    leba_F4_17 = c("L5m", "L5mn", "IV", "IS"),
    leba_F4_18 = c("LLiT10", "LLiT250", "L5mde", "IV", "IS"),
    leba_F5_19 = c("IV", "IS", "L5mn", "L5mde", "M10m"),
    leba_F5_20 = c("LE", "SCmelEDI", "MPratio"),
    leba_F5_21 = c("LE"),
    leba_F5_22 = c("SCmelEDI", "FLiT10", "FLiT250"),
    leba_F5_23 = c("SCmelEDI", "FLiT10", "FLiT250")
  )

#transform relevant metrics to a table of metrics
metric_selectionH5 <-
metric_selection$H5 %>%
  enframe() %>%
  unnest(value) %>%
  mutate(hypothesised = TRUE)

5.4.1 Hypothesis 5

\(H5\): LEBA items correlate with preselected light-logger derived light exposure variables.

Because of how the LEBA questions are phrased, different timespans need to be compared. For Malaysia, only the last questionnaire instance is used, and average metric values are calculated from the daily data. For Switzerland, matching time periods are compared to one another, i.e., a LEBA period of 4 days is compared with the light exposure metrics from those same 4 days, averaged over the period for a one-to-one comparison.
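At its core, the analysis below computes a Spearman correlation per metric-item pair and adjusts each p-value against the preregistered number of comparisons. A stdlib-only Python sketch of both pieces follows (the R script uses cor(), cor.test() and p.adjust(); note that for a single p-value, Benjamini-Hochberg adjustment against a fixed n effectively reduces to min(1, p*n)):

```python
from statistics import mean

def ranks(xs):
    """Average ranks (ties share the mean of their positions)."""
    sorted_xs = sorted(xs)
    return [mean(i + 1 for i, v in enumerate(sorted_xs) if v == x) for x in xs]

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx)
           * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

def fdr_single(p, n):
    """Benjamini-Hochberg adjustment of one p-value against n planned
    comparisons; for a single value this equals min(1, p * n)."""
    return min(1.0, p * n)

print(round(spearman([1, 2, 3, 4], [10, 40, 20, 50]), 2))  # -> 0.8
print(round(fdr_single(0.0002, 168), 4))                   # -> 0.0336
```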

#collect the light exposure data for malaysia
metrics_H5_malaysia <- 
  metricsRQ3 %>% 
  filter(site == "malaysia") %>% 
  unnest(data) %>% 
  select(-site) %>% 
  filter(metric != "mean")

#average values across all days
metrics_H5_malaysia <- 
  metrics_H5_malaysia %>% 
  rowwise() %>% 
  mutate(data = list(data %>% 
                       group_by(Id) %>% 
                       summarize(value = mean(value, na.rm = TRUE), .groups = "drop")
  )
  )

#collect the appropriate day for the LEBA data
leba_H5_malaysia <-
  leba_data_combined %>% filter(site == "malaysia", Day ==31) %>% 
  select(-Datetime, -site, -Day) %>% 
    mutate(across(-Id, as.numeric)) %>% 
  pivot_longer(cols = -Id, names_to = "leba", values_to = "leba_values")

#join the data
leba_light_H5_malaysia <-
metrics_H5_malaysia %>% 
  unnest(data) %>% 
  rename(metric_value = value) %>%
  left_join(leba_H5_malaysia, by = "Id",
            relationship = "many-to-many") %>% 
  select(-Id) %>% 
  drop_na()

#calculate correlations
corr_malaysia <-
  leba_light_H5_malaysia %>% 
    group_by(metric, leba) %>% 
    summarize(
      correlation = cor(metric_value, leba_values, method = "spearman"),
              p_value = cor.test(metric_value, leba_values,
                                 method = "spearman", exact = FALSE)$p.value,
              p_value_adjusted = p.adjust(p_value, method = "fdr", 
                                          n = p_adjustment$H5),
              significant = p_value <= 0.05,
              significant_adjusted = p_value_adjusted <= 0.05,
              .groups = "drop") %>%
  mutate(      
    question = leba_questions[leba],
    factor = leba_factors[leba],
    question = ifelse(is.na(question), factor, question),
    .before = 1) %>% 
  fill(factor) %>% 
  left_join(metric_selectionH5, by = c("metric" = "value", "leba" = "name")) %>% 
  mutate(hypothesised = ifelse(is.na(hypothesised), FALSE, hypothesised),
         coloring = case_when(
           significant_adjusted & hypothesised ~ "Hypothesised and significant",
           significant & hypothesised ~ "Hypothesised and (unadjusted) significant",
           significant & !hypothesised ~ "(unadjusted) Significant",
           !significant_adjusted & hypothesised ~ "Hypothesised",
           TRUE ~ "None"
         ) %>% factor(levels = c("Hypothesised and significant", "Hypothesised and (unadjusted) significant", "(unadjusted) Significant", "Hypothesised", "None")
         )
         ) %>% 
    mutate(
    leba = leba %>% 
        str_remove("leba_") %>% 
        str_replace("_", ", ") %>% 
        str_replace("(F[:digit:])$", "\\1 (overall)")
  )

colors_H5 <- c("gold", "orange", "white", "red", "grey")


#display the results
corr_plot_malaysia <- 
corr_malaysia %>% 
  ggplot(aes(x = metric, y = (leba %>% factor() %>% fct_rev()), fill = abs(correlation), col = coloring)) +
  geom_blank()+
  geom_tile(data = corr_malaysia %>% filter(coloring != "None"), linewidth = 0.25) + 
  geom_text(
    aes(label = 
          paste(
            paste("c=",round(correlation, 2)), 
            p_value %>% style_pvalue(prepend_p = TRUE), 
            sep = "\n")
        ),
    fontface = "bold",
    size = 1.8) +
  coord_fixed() +
  scale_fill_viridis_c(limits = c(0,1))+
  scale_color_manual(values = colors_H5, drop = FALSE)+
  labs(y = "LEBA questions and factors", x = "Metrics", color = "Relevance",
       fill = "(abs) Correlation", subtitle = "Malaysia")+
  theme(axis.text.x = element_text(angle = 90, hjust = 1),
        plot.subtitle = element_text(color = pal_jco()(1)))

corr_plot_malaysia

To combine the relevant metrics with the LEBA questionnaires, metric days are associated with the day of each LEBA questionnaire plus the preceding days, back to the previous collection point.
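Assuming each questionnaire covers the days since the previous one, this mapping (done below in R via a join plus fill(.direction = "up")) amounts to assigning each metric day to the next collection day at or after it; a hypothetical sketch with an illustrative schedule:

```python
def assign_collection_period(day, collection_days):
    """Map a metric day to the next LEBA collection day at or after it,
    mirroring a backward fill on the joined, day-sorted table."""
    for col_day in sorted(collection_days):
        if day <= col_day:
            return col_day
    return None  # day lies after the last questionnaire

collection_days = [4, 8, 12]  # illustrative collection schedule
days = [1, 4, 5, 9, 13]
print([assign_collection_period(d, collection_days) for d in days])
# -> [4, 4, 8, 12, None]
```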

#collect the light exposure data for Switzerland
metrics_H5_switzerland <- 
  metricsRQ3 %>% 
  filter(site == "switzerland") %>% 
  unnest(data) %>% 
  select(-site) %>% 
  filter(metric != "mean") %>% 
  filter(metric != "IS")
  #we have to remove IS for now, as it has to be computed for each section for Switzerland

#collect the appropriate day for the LEBA data
leba_H5_switzerland <-
  leba_data_combined %>% filter(site == "switzerland") %>% 
  mutate(
    leba_col_period = Day,
    Day = date(Datetime)) %>% 
  select(-Datetime, -site) %>% 
    mutate(across(-c(Id, Day, leba_col_period), as.numeric))

#average values across collection days
metrics_H5_switzerland <- 
metrics_H5_switzerland %>% 
  rowwise() %>% 
  mutate(data = list(
    data %>% 
  full_join(leba_H5_switzerland, by = c("Id", "Day")) %>% 
  arrange(Id, Day) %>% 
  fill(starts_with("leba_"), .direction = "up") %>% 
  drop_na(value) %>% 
  rename(col_period = leba_col_period) %>% 
  pivot_longer(cols = starts_with("leba_"), names_to = "leba", values_to = "leba_values") %>% 
  group_by(Id, col_period) %>% 
  summarize(
    value = mean(value, na.rm = TRUE), .groups = "drop"
  )
  ))

#calculating IS for Switzerland for each section
data_collection_switzerland <- 
leba_H5_switzerland %>% 
  group_by(Id,  Day, leba_col_period) %>% 
  summarize(.groups = "drop") 

metrics_H5_switzerland_IS <- 
data$switzerland %>% 
  full_join(data_collection_switzerland, by = c("Id", "Day")) %>% 
  arrange(Id, Day) %>% 
  fill(starts_with("leba_"), .direction = "up") %>% 
  group_by(Id, leba_col_period) %>% 
  summarize(
    IS = interdaily_stability(MEDI, Datetime, na.rm = TRUE),
    .groups = "drop"
  ) %>% 
  rename(col_period = leba_col_period) %>%
  pivot_longer(cols = IS, names_to = "metric", values_to = "value") %>% 
  drop_na(value) %>% 
  nest(data = -metric)
Warning: There were 61 warnings in `summarize()`.
The first warning was:
ℹ In argument: `IS = interdaily_stability(MEDI, Datetime, na.rm = TRUE)`.
ℹ In group 3: `Id = ID01` and `leba_col_period = 11`.
Caused by warning in `interdaily_stability()`:
! One or more days in the data do not consist of 24 h.
Interdaily stability might not be meaningful for non-24 h days.
These days contain less than 24 h:
[1] NA
ℹ Run `dplyr::last_dplyr_warnings()` to see the 60 remaining warnings.
metrics_H5_switzerland <- 
  rbind(metrics_H5_switzerland,
        metrics_H5_switzerland_IS)

#transform the LEBA data to the appropriate format
leba_H5_switzerland <-
  leba_H5_switzerland %>% 
  pivot_longer(cols = -c(Id, Day, leba_col_period), 
               names_to = "leba", values_to = "leba_values")

#join the data
leba_light_H5_switzerland <-
metrics_H5_switzerland %>% 
  unnest(data) %>% 
  rename(metric_value = value) %>%
  left_join(leba_H5_switzerland, by = c("Id", "col_period" = "leba_col_period"),
            relationship = "many-to-many") %>% 
  select(-Id) %>% 
  drop_na()


#calculate correlations
corr_switzerland <-
  leba_light_H5_switzerland %>% 
    group_by(metric, leba) %>% 
    summarize(
      correlation = cor(metric_value, leba_values, method = "spearman"),
              p_value = cor.test(metric_value, leba_values,
                                 method = "spearman", exact = FALSE)$p.value,
              p_value_adjusted = p.adjust(p_value, method = "fdr", 
                                          n = p_adjustment$H5),
              significant = p_value <= 0.05,
              significant_adjusted = p_value_adjusted <= 0.05,
              .groups = "drop") %>%
  mutate(      
    question = leba_questions[leba],
    factor = leba_factors[leba],
    question = ifelse(is.na(question), factor, question),
    .before = 1) %>% 
  fill(factor) %>% 
  left_join(metric_selectionH5, by = c("metric" = "value", "leba" = "name")) %>% 
  mutate(hypothesised = ifelse(is.na(hypothesised), FALSE, hypothesised),
         coloring = case_when(
           significant_adjusted & hypothesised ~ "Hypothesised and significant",
           significant & hypothesised ~ "Hypothesised and (unadjusted) significant",
           significant & !hypothesised ~ "(unadjusted) Significant",
           !significant_adjusted & hypothesised ~ "Hypothesised",
           TRUE ~ "None"
         ) %>% factor(levels = c("Hypothesised and significant", "Hypothesised and (unadjusted) significant", "(unadjusted) Significant", "Hypothesised", "None")
         )) %>% 
      mutate(
    leba = leba %>% 
        str_remove("leba_") %>% 
        str_replace("_", ", ") %>% 
        str_replace("(F[:digit:])$", "\\1 (overall)")
  )


#display the results
corr_plot_switzerland <- 
corr_switzerland %>% 
  ggplot(aes(x = metric, y = (leba %>% factor() %>% fct_rev()), fill = abs(correlation), col = coloring)) +
  geom_tile(data = corr_switzerland %>% filter(coloring != "None"), linewidth = 0.25) + 
  geom_text(
    aes(label = 
          paste(
            paste("c=",round(correlation, 2)), 
            p_value %>% style_pvalue(prepend_p = TRUE), 
            sep = "\n")
        ),
    size = 1.8,
    fontface = "bold") +
  coord_fixed() +
  scale_fill_viridis_c(limits = c(0,1))+
  scale_color_manual(values = colors_H5, drop = FALSE)+
  labs(y = "LEBA questions and factors", x = "Metrics", color = "Relevance",
       fill = "(abs) Correlation", subtitle = "Switzerland")+
  theme(axis.text.x = element_text(angle = 90, hjust = 1),
        plot.subtitle = element_text(color = pal_jco()(2)[2]))

corr_plot_switzerland

#visualization
(corr_plot_malaysia + theme(legend.position = "none")) + corr_plot_switzerland + plot_layout(guides = "collect")

corr_results <- list_rbind(
  list(malaysia = corr_malaysia, 
       switzerland = corr_switzerland), 
  names_to = "site")

#summary of results
corr_results %>% 
  select(site, coloring) %>% 
  tbl_summary(by = site,
              label = list(coloring ~ "Summary of correlation matrices")) %>% 
  as_gt() %>%
  gt::tab_header(title = "Results for Hypothesis 5")
Results for Hypothesis 5
Characteristic                                 malaysia, N = 5321   switzerland, N = 5321
Summary of correlation matrices
    Hypothesised and significant               0 (0%)               10 (1.9%)
    Hypothesised and (unadjusted) significant  0 (0%)               19 (3.6%)
    (unadjusted) Significant                   14 (2.6%)            150 (28%)
    Hypothesised                               84 (16%)             55 (10%)
    None                                       434 (82%)            298 (56%)
1 n (%)
corr_results %>% group_by(site, coloring) %>% summarize(n = n(), mean_cor = mean(abs(correlation)), max_cor = max(abs(correlation))) %>% 
  gt() %>% 
  tab_header("Summary of correlation coefficients by type, Hypothesis 5") %>% 
  tab_footnote(
    "Mean correlation is the average of the absolute correlation coefficients",
    locations = cells_column_labels(columns = "mean_cor")
  ) %>% 
  tab_footnote(
    "Max correlation is the highest absolute correlation coefficient",
    locations = cells_column_labels(columns = "max_cor")
  ) %>% 
#make col labels and row labels bold
  tab_style(
    style = list(
      cell_text(weight = "bold")
    ),
    locations = list(cells_column_labels(), cells_row_groups())
  ) %>% 
  cols_label(
    coloring = "",
    mean_cor = "Mean correlation",
    max_cor = "Max correlation"
  ) %>% 
  fmt_number(
    columns = c(mean_cor, max_cor),
    decimals = 2
  ) %>% 
  text_transform(str_to_title, locations = cells_row_groups()) %>% 
  cols_align(align = "left", columns = coloring)
`summarise()` has grouped output by 'site'. You can override using the
`.groups` argument.
Summary of correlation coefficients by type, Hypothesis 5
n Mean correlation1 Max correlation2
Malaysia
(unadjusted) Significant 14 0.44 0.64
Hypothesised 84 0.19 0.51
None 434 0.17 0.60
Switzerland
Hypothesised and significant 10 0.39 0.48
Hypothesised and (unadjusted) significant 19 0.23 0.38
(unadjusted) Significant 150 0.21 0.44
Hypothesised 55 0.07 0.28
None 298 0.09 0.30
1 Mean correlation is the average of the absolute correlation coefficients
2 Max correlation is the highest absolute correlation coefficient
interpretation <- 
  c("The more often 30 minutes or less are spent outside, the shorter the period above 1000 lx",
    "The more often >3 hours per day are spent outside, the longer the period above 1000 lx",
    "The more often >3 hours per day are spent outside, the higher the time above 1000 lx",
    "The more time is spent outside whenever possible, the brighter the brightest 10 hours (mean)",
    "The more time is spent outside whenever possible, the longer the period above 1000 lx",
    "The more time is spent outside whenever possible, the higher the time above 1000 lx",
    "The more time is spent outside whenever possible, the higher the time above 250 lx",
    "The more often smartwatches are looked at within 1 h before attempting to fall asleep, the later the last time above 250 lx",
    "The more an alarm with a dawn simulation light is used, the warmer the light colour temperature",
    "The more often the lights are turned on immediately after waking up, the earlier the first time above 250 lx"
    )

corr_results %>% filter(significant_adjusted) %>% 
  select(-c(p_value, significant, significant_adjusted, factor, coloring)) %>% 
  mutate(across(starts_with("p_value"), style_pvalue),
         correlation = correlation %>% round(2)) %>% 
  arrange(leba) %>%
  filter(hypothesised) %>% 
  gt() %>% 
  cols_hide(hypothesised) %>% 
  tab_header("Summary of significant results for Hypothesis 5") %>% 
    tab_footnote(
      paste("p-values are adjusted for multiple comparisons using the false-discovery-rate for n=", p_adjustment$H5, "comparisons"),
      locations = cells_column_labels(columns = "p_value_adjusted")
    ) %>%
    tab_footnote(
      "Indication whether this specific combination was hypothesised in the preregistration document",
      locations = cells_column_labels(columns = "hypothesised")
    ) %>%
    tab_footnote(
      "Spearman correlation coefficient between the metric and the LEBA question/factor",
      locations = cells_column_labels(columns = "correlation")
    ) %>%
  cols_label(
    site = "Site",
    question = "Question/Factor",
    leba = "LEBA item",
    correlation = "Correlation coefficient",
    p_value_adjusted = "p-value",
    metric = "Metric"
  ) %>% 
  cols_add(Interpretation = interpretation) %>% 
  fmt(columns = leba, fns = \(x) x %>% str_remove("leba_") %>% 
        str_replace("_", ", ")) %>% 
  gtsave("figures/Table_1.png", vwidth = 1100)

The results clearly show strong correlations between certain light exposure metrics and answers to the LEBA questionnaire, but only in the Swiss dataset: after adjustment for multiple comparisons, none of the correlations in the Malaysian dataset remained significant. Interestingly, the average (absolute) correlation is stronger at the Malaysian site than at the Swiss site, which may point to higher variance in either the light exposure metrics or the LEBA answers in the Malaysian dataset. For the Swiss dataset, 10 of the hypothesised correlations proved significant, with a medium effect size on average (Cohen 1988, 1992). For exploratory purposes, unadjusted significance levels are retained in the matrices.

5.4.2 Hypothesis 6

\(H6\): There is a difference between Malaysia and Switzerland on how well light-logger derived light exposure variables correlate with subjective LEBA items.

The preregistration document specifies a generalized mixed model approach to test this hypothesis. As the specified formula does not lead to the stated number of models, i.e. 23+5, and also does not provide an overall overview of whether selected light exposure metrics correlate with LEBA items and factors, we provide an alternative approach: modelling the dependency of the correlation coefficients on site, which yields a more condensed output and uses a simple linear model. The stricter analysis is still included in the analysis documentation.
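Per item, this alternative approach compares correlation ~ site against correlation ~ 1 via anova(); with one grouping factor, that is an F-test on the reduction in the residual sum of squares. A stdlib Python sketch of the F statistic with hypothetical coefficients (p-values would then come from the F distribution):

```python
from statistics import mean

def f_statistic(correlations, sites):
    """F statistic comparing `correlation ~ site` against the
    intercept-only model `correlation ~ 1`."""
    grand = mean(correlations)
    ssr0 = sum((c - grand) ** 2 for c in correlations)  # null model RSS
    groups = {}
    for c, s in zip(correlations, sites):
        groups.setdefault(s, []).append(c)
    ssr1 = sum(sum((c - mean(g)) ** 2 for c in g) for g in groups.values())
    df1 = len(groups) - 1
    df2 = len(correlations) - len(groups)
    return ((ssr0 - ssr1) / df1) / (ssr1 / df2)

corr = [0.10, 0.20, 0.50, 0.60]  # hypothetical correlation coefficients
site = ["malaysia", "malaysia", "switzerland", "switzerland"]
print(round(f_statistic(corr, site), 6))  # -> 32.0
```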

leba_light_H6 <- 
corr_results %>% 
  filter(hypothesised) %>%
  nest(data = -c(leba, question, factor)) %>% 
  arrange(leba)

#hypothesis formulas
formula_H6 <- correlation ~ site
formula_H0 <- correlation ~ 1

inference_H6 <- 
  leba_light_H6 %>% 
  mutate(formula_H1 = c(formula_H6),
         formula_H0 = c(formula_H0),
         type = "lm",
         family = list(gaussian()),
         metric = leba
         )


inference_H6 <- 
inference_H6 %>%
  rowwise() %>% 
  mutate(model = list(lm(data = data, formula = formula_H1)),
         model0 = list(lm(data = data, formula = formula_H0)),
         comparison = list(anova(model, model0)),
        p.value_adj = comparison %>% tidy() %>% pull(p.value) %>% .[2] %>%
        p.adjust(method = "fdr", n = p_adjustment$H6),
      significant = p.value_adj <= 0.05,
      summary = list(
        switch(significant+1,
               tidy(model0),
               tidy(model)))
  ) %>% 
  drop_na() %>% 
  rowwise() %>% 
  mutate(
    table = list(
        tibble(
          metric = metric,
          p.value =
            p.value_adj %>%
            style_pvalue() %>%
            {ifelse(significant, paste0("**", ., "**"),.)},
          summary %>%
            mutate(estimate = estimate) %>%
            select(term, estimate) %>%
            mutate(term = str_remove_all(term, "\\(|\\)|site")) %>% 
            pivot_wider(names_from = term, values_from = estimate) 
        ) %>%
          mutate(malaysia = if(exists("switzerland")) 0 else NA, .after = Intercept)
    )
  )

Inference_Table_H6 <- 
inference_H6 %>% 
  mutate(metric = leba) %>% 
Inference_Table(p_adjustment = p_adjustment$H6, value = correlation) %>%
  fmt_number("Intercept", decimals = 2) %>%
  fmt_number("switzerland", decimals = 2) %>%
  cols_align(align = "center", columns = "p.value") %>%
  tab_header(title = "Model Results for Hypothesis 6")

Inference_Table_H6
Model Results for Hypothesis 6
                               Site coefficients
         p-value1   Intercept  Malaysia  Switzerland
F2, 04   0.2        −0.07
F2, 05   0.9        −0.07
F2, 06   0.054      0.03
F2, 07   0.4        0.06
F2, 08   0.9        0.16
F2, 09   0.9        −0.08
F3, 10   0.9        0.03
F3, 11   0.9        0.05
F3, 12   0.9        0.00
F3, 13   0.9        0.01
F3, 14   0.9        −0.08
F4, 15   0.019      0.25       0         −0.40
F4, 16   0.9        0.09
F4, 17   0.9        −0.16
F4, 18   0.035      0.25       0         −0.26
F5, 19   0.9        −0.10
F5, 20   0.9        0.21
F5, 22   0.9        −0.20
F5, 23   0.9        −0.07
1 p-values are adjusted for multiple comparisons using the false-discovery-rate for n= 28 comparisons
Inference_Table_H6 %>% gtsave("figures/Table_H6.png")
v1 <- inference_H6 %>% filter(significant) %>% pull(question)
#join the data
leba_light_H6_malaysia <-
metrics_H5_malaysia %>% 
  unnest(data) %>% 
  rename(metric_value = value) %>%
  left_join(leba_H5_malaysia, by = "Id",
            relationship = "many-to-many") %>% 
  drop_na()

#join the data
leba_light_H6_switzerland <-
metrics_H5_switzerland %>% 
  unnest(data) %>% 
  rename(metric_value = value) %>%
  left_join(leba_H5_switzerland, by = c("Id", "col_period" = "leba_col_period"),
            relationship = "many-to-many") %>% 
  drop_na()

#create a combined dataset with metrics and LEBA items
light_leba_combined <-
  list_rbind(
  list(malaysia = leba_light_H6_malaysia, 
       switzerland = leba_light_H6_switzerland ), 
  names_to = "site") %>% 
  select(-col_period, -Day) %>% 
  nest(data = -c(metric, leba)) %>% 
  left_join(metric_selectionH5, by = c("metric" = "value", "leba" = "name")) %>% 
  filter(hypothesised) %>% 
  select(-hypothesised)

#hypothesis formulas
formula_H6 <- metric_value ~ site*leba_values + (1|site:Id)
formula_H0 <- metric_value ~ leba_values + (1|site:Id)

lmer <- lme4::lmer

inference_H6.2 <- 
  light_leba_combined %>% 
  mutate(formula_H1 = c(formula_H6),
         formula_H0 = c(formula_H0),
         type = "lmer",
         family = list(gaussian())
         )

inference_H6.2 <-
  inference_H6.2 %>% 
  inference_summary2(p_adjustment = p_adjustment$H6)
boundary (singular) fit: see help('isSingular')
refitting model(s) with ML (instead of REML)
inference_H6.2 %>% 
  rowwise() %>% 
  mutate(metric_leba = paste(metric, leba %>% str_remove("leba_") %>% 
                               str_replace("_", ", "), sep = " & "),
         table = list(table %>% mutate(metric = metric_leba))) %>% 
Inference_Table(p_adjustment = p_adjustment$H6, value = metric_value) %>%
  fmt_number("Intercept", decimals = 2) %>%
  fmt_number("switzerland", decimals = 2) %>%
  cols_align(align = "center", columns = "p.value") %>%
  tab_header(title = "Model Results for Hypothesis 6")
Model Results for Hypothesis 6

| Metric & LEBA item | p-value¹ | Intercept | Site: Malaysia | Site: Switzerland | leba_values | switzerland:leba_values |
|---|---|---|---|---|---|---|
| TAT250 & F2, 04 | 0.072 | 12,390.09 | — | — | -9.040267e+02 | — |
| TAT250 & F2, 05 | 0.052 | 9,998.34 | — | — | 9.516650e+00 | — |
| TAT250 & F2, 06 | 0.027 | 7,221.81 | 0 | 3,344.20 | -4.665635e+02 | 1.056576e+03 |
| TAT250 & F2, 07 | 0.004 | 6,932.53 | 0 | 1,784.68 | -4.005614e+02 | 1.889049e+03 |
| TAT250 & F2, 08 | 0.041 | 6,142.27 | 0 | 2,323.64 | -1.594537e+02 | 1.509960e+03 |
| TAT1000 & F2, 04 | 0.9 | 7,535.08 | — | — | -9.475831e+02 | — |
| TAT1000 & F2, 05 | 0.5 | 5,638.13 | — | — | -1.709208e+02 | — |
| TAT1000 & F2, 06 | 0.2 | 3,547.15 | — | — | 5.489428e+02 | — |
| TAT1000 & F2, 07 | 0.004 | 2,827.73 | 0 | −656.85 | -4.446896e+01 | 1.748464e+03 |
| TAT1000 & F2, 08 | 0.3 | 1,404.46 | — | — | 1.433454e+03 | — |
| PAT1000 & F2, 04 | 0.9 | 2,309.81 | — | — | -3.297124e+02 | — |
| PAT1000 & F2, 05 | 0.9 | 1,869.17 | — | — | -1.275169e+02 | — |
| PAT1000 & F2, 06 | 0.9 | 1,009.14 | — | — | 1.712798e+02 | — |
| PAT1000 & F2, 07 | 0.088 | 532.91 | — | — | 3.966336e+02 | — |
| PAT1000 & F2, 08 | 0.9 | 380.29 | — | — | 4.272156e+02 | — |
| M10m & F2, 04 | 0.9 | 2.89 | — | — | -1.007520e-01 | — |
| M10m & F2, 05 | 0.9 | 2.61 | — | — | 3.213340e-03 | — |
| M10m & F2, 06 | 0.7 | 2.41 | — | — | 7.433187e-02 | — |
| M10m & F2, 07 | 0.3 | 2.37 | — | — | 9.893745e-02 | — |
| M10m & F2, 08 | 0.9 | 2.34 | — | — | 1.095992e-01 | — |
| M10m & F5, 19 | 0.9 | 2.62 | — | — | 3.201022e-03 | — |
| L5m & F3, 12 | 0.9 | −0.91 | — | — | 7.514914e-03 | — |
| L5m & F3, 14 | 0.6 | −0.88 | — | — | -1.102175e-02 | — |
| L5m & F4, 17 | 0.12 | −0.86 | — | — | -9.640222e-03 | — |
| IV & F2, 04 | 0.9 | 1.24 | — | — | 1.583709e-02 | — |
| IV & F2, 05 | 0.9 | 1.26 | — | — | 5.229081e-03 | — |
| IV & F2, 06 | 0.9 | 1.23 | — | — | 1.614837e-02 | — |
| IV & F2, 07 | 0.9 | 1.36 | — | — | -3.397813e-02 | — |
| IV & F2, 08 | 0.9 | 1.37 | — | — | -3.383786e-02 | — |
| IV & F2, 09 | 0.9 | 1.27 | — | — | 2.673268e-03 | — |
| IV & F3, 10 | 0.9 | 1.09 | — | — | 4.272856e-02 | — |
| IV & F3, 11 | 0.9 | 1.16 | — | — | 3.001735e-02 | — |
| IV & F3, 12 | 0.9 | 1.27 | — | — | 2.473250e-03 | — |
| IV & F3, 13 | 0.9 | 1.30 | — | — | -1.906535e-02 | — |
| IV & F3, 14 | 0.9 | 1.32 | — | — | -3.611345e-02 | — |
| IV & F4, 15 | 0.9 | 1.25 | — | — | 7.252447e-03 | — |
| IV & F4, 16 | 0.9 | 1.26 | — | — | 5.975872e-03 | — |
| IV & F4, 17 | 0.9 | 1.36 | — | — | -2.154528e-02 | — |
| IV & F4, 18 | 0.9 | 1.24 | — | — | 1.627129e-02 | — |
| IV & F5, 19 | 0.9 | 1.28 | — | — | -3.383899e-03 | — |
| LLiT10 & F3, 10 | 0.7 | 78,646.06 | — | — | 5.040035e+02 | — |
| LLiT10 & F3, 13 | 0.9 | 82,018.40 | — | — | -8.970010e+02 | — |
| LLiT10 & F4, 15 | 0.9 | 80,534.01 | — | — | 1.082976e+02 | — |
| LLiT10 & F4, 16 | 0.7 | 80,945.17 | — | — | -3.792629e+01 | — |
| LLiT10 & F4, 18 | 0.9 | 80,181.25 | — | — | 2.977942e+02 | — |
| LLiT250 & F3, 10 | 0.2 | 64,988.30 | — | — | 5.360724e+02 | — |
| LLiT250 & F3, 13 | 0.053 | 64,784.78 | — | — | 2.141667e+03 | — |
| LLiT250 & F4, 15 | 0.091 | 67,075.66 | — | — | 8.623077e+01 | — |
| LLiT250 & F4, 16 | 0.15 | 67,009.60 | — | — | 1.767484e+02 | — |
| LLiT250 & F4, 18 | 0.086 | 66,287.32 | — | — | 4.608893e+02 | — |
| IS & F2, 04 | <0.001 | 0.17 | 0 | 0.32 | -1.583543e-03 | -4.067143e-03 |
| IS & F2, 05 | <0.001 | 0.19 | 0 | 0.30 | -7.459296e-03 | 6.023305e-03 |
| IS & F2, 06 | <0.001 | 0.16 | 0 | 0.33 | 4.941166e-04 | -3.544291e-03 |
| IS & F2, 07 | <0.001 | 0.19 | 0 | 0.29 | -8.181988e-03 | 9.758857e-03 |
| IS & F2, 08 | <0.001 | 0.14 | 0 | 0.35 | 8.505965e-03 | -1.334932e-02 |
| IS & F2, 09 | <0.001 | 0.17 | 0 | 0.32 | -7.340919e-03 | 2.355593e-04 |
| IS & F3, 10 | <0.001 | 0.20 | 0 | 0.36 | -8.797350e-03 | -9.571046e-03 |
| IS & F3, 11 | <0.001 | 0.17 | 0 | 0.27 | -1.764296e-03 | 1.045223e-02 |
| IS & F3, 12 | <0.001 | 0.21 | 0 | 0.26 | -1.418374e-02 | 2.063116e-02 |
| IS & F3, 13 | <0.001 | 0.10 | 0 | 0.35 | 5.253330e-02 | -3.444418e-02 |
| IS & F3, 14 | <0.001 | 0.20 | 0 | 0.25 | -3.667408e-02 | 5.822572e-02 |
| IS & F4, 15 | <0.001 | 0.14 | 0 | 0.34 | 5.952810e-03 | -7.072971e-03 |
| IS & F4, 16 | <0.001 | 0.14 | 0 | 0.33 | 1.058658e-02 | -8.936431e-03 |
| IS & F4, 17 | <0.001 | 0.20 | 0 | 0.38 | -1.149060e-02 | -1.385521e-02 |
| IS & F4, 18 | <0.001 | 0.14 | 0 | 0.33 | 7.802745e-03 | -3.096007e-03 |
| IS & F5, 19 | <0.001 | 0.18 | 0 | 0.28 | -1.093759e-02 | 2.187314e-02 |
| FLiT1000 & F2, 09 | 0.9 | 38,694.30 | — | — | -7.638506e+02 | — |
| FLiT250 & F3, 11 | 0.9 | 39,358.24 | — | — | -1.185375e+03 | — |
| FLiT250 & F5, 22 | 0.9 | 34,991.97 | — | — | -2.698562e+02 | — |
| FLiT250 & F5, 23 | 0.7 | 38,393.32 | — | — | -1.594819e+03 | — |
| MPratio & F5, 20 | 0.9 | 0.93 | — | — | 5.388447e-04 | — |
| LE & F5, 20 | 0.4 | 14,306.95 | — | — | 1.400444e+03 | — |
| LE & F5, 21 | 0.9 | 20,285.18 | — | — | -2.631760e+03 | — |
| SCmelEDI & F5, 20 | 0.012 | 0.00 | 0 | 0.00 | 3.372993e-06 | 1.016210e-05 |
| SCmelEDI & F5, 22 | 0.001 | 0.00 | 0 | 0.00 | -5.273807e-05 | 1.017089e-05 |
| SCmelEDI & F5, 23 | 0.012 | 0.00 | 0 | 0.00 | 1.745354e-05 | -8.880466e-06 |
| L5mde & F4, 15 | 0.9 | 0.96 | — | — | -2.695647e-02 | — |
| L5mde & F4, 16 | 0.9 | 0.79 | — | — | 4.151845e-02 | — |
| L5mde & F4, 18 | 0.9 | 0.72 | — | — | 6.858306e-02 | — |
| L5mde & F5, 19 | 0.9 | 0.82 | — | — | 3.718367e-02 | — |
| L5mn & F3, 10 | 0.016 | 0.16 | 0 | −1.10 | -1.175465e-01 | 1.471177e-01 |
| L5mn & F3, 13 | 0.059 | −0.56 | — | — | -4.882849e-02 | — |
| L5mn & F4, 17 | 0.052 | −0.53 | — | — | -2.436320e-02 | — |
| L5mn & F5, 19 | 0.067 | −0.62 | — | — | -1.217793e-03 | — |

¹ p-values are adjusted for multiple comparisons using the false-discovery rate for n = 28 comparisons. Site coefficients are given relative to Malaysia (reference level, 0); dashes (—) mark coefficients not reported for that model.

Overall, the correlations between light exposure metrics and LEBA scores do not differ significantly between Malaysia and Switzerland. Exceptions are the following items:

  • I dim my mobile phone screen within 1 hour before attempting to fall asleep

  • I dim my computer screen within 1 hour before attempting to fall asleep
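The multiple-comparison correction named in the table footnote corresponds to base R's `p.adjust()` with `method = "fdr"` (Benjamini–Hochberg); a minimal sketch with hypothetical raw p-values, using `n = 28` as in the footnote:

```r
# Benjamini-Hochberg (FDR) adjustment across n = 28 comparisons,
# as in the table footnote. The raw p-values here are hypothetical.
p_raw <- c(0.001, 0.004, 0.012, 0.027, 0.041, 0.9)

p_fdr <- p.adjust(p_raw, method = "fdr", n = 28)

# adjusted p-values are never smaller than the raw ones
# and preserve their ordering
stopifnot(all(p_fdr >= p_raw), !is.unsorted(p_fdr))
```

Because `n` exceeds the number of supplied p-values, the adjustment behaves as if the remaining comparisons were non-significant, which is the conservative choice when only a subset of tests is passed in.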

6 Figure and Table generation

In this section, figures for the analysis are produced.

# to save or restore the analysis environment up to this point
# save.image("intermediary_environment/image.RData")
# load("intermediary_environment/image.RData")

6.1 PSQI

In this section, we calculate PSQI scores for both sites.

psqi_labels <- c(psqi_1 = "Subjective sleep quality",
                 psqi_2 = "Sleep latency",
                 psqi_3 = "Sleep duration",
                 psqi_4 = "Habitual sleep efficiency",
                 psqi_5 = "Sleep disturbance",
                 psqi_6 = "Use of sleep medication",
                 psqi_7 = "Daytime dysfunction",
                 psqi_8 = "Global PSQI")

#Malaysia
path <- "data/Malaysia/psqi_malaysia_corrected.csv"

psqi_factors <- list(
  type1 = c("Not during the past month", "Less than once a week", "Once or twice a week", "Three or more times a week"),
  type2 = c("No problem at all", "Only a very slight problem", "Somewhat of a problem", "A very big problem"),
  type3 = c("Very good", "Fairly good", "Fairly bad", "Very bad"),
  type4 = c("No bed partner or room mate", "Partner/room mate in other room", "Partner in same room but not same bed", "Partner in same bed"),
  # "ore" [sic]: presumably mirrors a typo in the raw data; factor levels must match it exactly
  type5 = c("Not during the past month", "Less than once a week", "Once or twice a week", "Three ore more times a week")
)

psqi_malaysia <- 
  read_csv(path, id = "file.path") %>% 
  mutate(across(starts_with(c("psqi_05", "psqi_06", "psqi_07")), \(x) factor(x, psqi_factors$type1, 0:3)),
         psqi_08 = factor(psqi_08, psqi_factors$type2, 0:3),
         psqi_09 = factor(psqi_09, psqi_factors$type3, 0:3),
         psqi_10 = factor(psqi_10, psqi_factors$type4, 0:3),
         across(starts_with("psqi_11"), \(x) factor(x, psqi_factors$type5, 0:3)),
         across(psqi_05_1:psqi_11_5, \(x) as.numeric(x)-1)
         )
Rows: 38 Columns: 33
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (27): Id, psqi_01, psqi_02, psqi_03, psqi_04, psqi_05_1, psqi_05_2, psq...
dbl   (3): Day, psqi_02_corrected, psqi_04_corrected
time  (2): psqi_01_corrected, psqi_03_corrected

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
psqi_malaysia_results <- 
psqi_malaysia %>% 
  group_by(Id, Day) %>% 
  mutate(psqi_1 = psqi_09, .keep = "none",
         psqi_2 = case_when(
           psqi_02_corrected <= 15 ~ 0,
           psqi_02_corrected <= 30 ~ 1,
           psqi_02_corrected <= 60 ~ 2,
           psqi_02_corrected > 60 ~ 3,
           .default = NA),
         psqi_2 = ((psqi_2 + psqi_05_1)-1) %/% 2 + 1,
         psqi_3 = case_when(
           psqi_04_corrected > 7 ~ 0,
           psqi_04_corrected >= 6 ~ 1,
           psqi_04_corrected >= 5 ~ 2,
           psqi_04_corrected < 5 ~ 3,
           .default = NA
         ),
         psqi_4 = (psqi_03_corrected-psqi_01_corrected)/3600,
         psqi_4 = if_else(psqi_03_corrected < psqi_01_corrected, psqi_4 + 24, psqi_4),
         psqi_4 = (psqi_04_corrected / as.numeric(psqi_4))*100,
         psqi_4 = case_when(
           psqi_4 >= 85 ~ 0,
           psqi_4 >= 75 ~ 1,
           psqi_4 >= 65 ~ 2,
           psqi_4 < 65 ~ 3,
           .default = NA
         ),
         psqi_5 = (sum(pick(psqi_05_2:psqi_05_10), na.rm = TRUE) - 1) %/% 9 + 1,
         psqi_6 = psqi_06,
         psqi_7 = ((psqi_07 + psqi_08)-1) %/% 2 + 1,
         psqi_8 = sum(pick(psqi_1:psqi_7)),
         across(psqi_1:psqi_7, \(x) factor(x, levels = c(0,1,2,3)))
         )
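Two details of the scoring above deserve a note: the integer-division expressions implement the standard PSQI binning (a component sum of 0 scores 0, 1–2 score 1, 3–4 score 2, 5–6 score 3; analogously 1–9/10–18/19–27 for the nine disturbance items), and the `if_else()` step handles bedtimes after midnight by wrapping the time in bed around the day boundary. A minimal sketch of both (helper names are ours, for illustration):

```r
# PSQI component from the sum of two 0-3 sub-scores:
# 0 -> 0, 1-2 -> 1, 3-4 -> 2, 5-6 -> 3
component_of_2 <- function(s) (s - 1) %/% 2 + 1

# disturbance component from the sum of nine 0-3 items:
# 0 -> 0, 1-9 -> 1, 10-18 -> 2, 19-27 -> 3
component_of_9 <- function(s) (s - 1) %/% 9 + 1

# hours in bed from clock times (in hours), wrapping past midnight
time_in_bed <- function(bedtime, getup) {
  tib <- getup - bedtime
  ifelse(getup < bedtime, tib + 24, tib)
}

stopifnot(all(component_of_2(0:6) == c(0, 1, 1, 2, 2, 3, 3)))
stopifnot(all(component_of_9(c(0, 1, 9, 10, 18, 27)) == c(0, 1, 1, 2, 2, 3)))
stopifnot(time_in_bed(22, 6) == 8)  # 22:00 -> 06:00 is 8 h in bed
```

The `(s - 1) %/% 2 + 1` form works because R's `%/%` is floor division, so a sum of 0 maps to `(-1) %/% 2 + 1 = 0` rather than 1.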

#set variable labels
var_label(psqi_malaysia_results) <- psqi_labels %>% as.list()

psqi_malaysia_result_table <- 
psqi_malaysia_results %>% 
  tbl_summary(by = Day, include = -Id,
              missing_text = "missing",
              statistic = list(psqi_8 ~ "{median} ({min},{max})")) %>% 
  gtsummary::add_p() %>% 
  add_significance_stars()
The following errors were returned during `add_p()`:
✖ For variables `psqi_1` through `psqi_8` (`Day`) and "p.value" statistic: The
  package "cardx" (>= 0.2.2) is required.
psqi_malaysia_result_table %>% 
  as_gt() %>% 
  gtsave("figures/table_psqi_malaysia_results.png")
#Switzerland

#Import the data
leba_folders <- "data/Basel/REDCap"
leba_file <- "REDCap_CajochenASEAN_DATA_2023-10-30_1215.csv"
leba_files <- paste(leba_folders, leba_file, sep = "/")

psqi_switzerland <- 
  read_csv(leba_files, id = "file.path") %>% 
  mutate(Day = parse_number(redcap_event_name)) %>% 
  fill(code) %>% 
  select(code, Day, contains("psqi")) %>% 
  filter(Day %in% c(1,31)) %>% 
  group_by(code, Day) %>% 
  fill(everything(),.direction = "up") %>% 
  reframe(across(everything(), first))
Rows: 235 Columns: 142
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr    (4): redcap_event_name, code, leba_weekly_timestamp, psqi_q5j_other_d...
dbl  (125): record_id, teilnahme, registrierung_fr_studie_complete, leba_f1_...
lgl    (7): registrierung_fr_studie_timestamp, screening_scores_leba_weekly_...
dttm   (2): leba_end_timestamp, psqi_timestamp
time   (3): psqi_q1_bedtime, psqi_q3_getuptime, psqi_q4_sleephrs

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
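`parse_number()` above extracts the study day from `redcap_event_name` by dropping the non-numeric characters around the first number in the string. A minimal illustration (the event label below is hypothetical, for demonstration only):

```r
library(readr)

# a REDCap longitudinal event name carries the day as its first number;
# "day_31_arm_1" is a hypothetical label for illustration
event <- "day_31_arm_1"

stopifnot(parse_number(event) == 31)
```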
psqi_switzerland_results <- 
psqi_switzerland %>% 
  group_by(code, Day) %>% 
  mutate(psqi_1 = psqi_q6_sleepquality, .keep = "none",
         psqi_2 = case_when(
           psqi_q2_sleeptime <= 15 ~ 0,
           psqi_q2_sleeptime <= 30 ~ 1,
           psqi_q2_sleeptime <= 60 ~ 2,
           psqi_q2_sleeptime > 60 ~ 3,
           .default = NA),
         psqi_2 = ((psqi_2 + psqi_q5a_notfallasleep)-1) %/% 2 + 1,
         psqi_3 = case_when(
           psqi_q4_sleephrs/3600 > 7 ~ 0,
           psqi_q4_sleephrs/3600 >= 6 ~ 1,
           psqi_q4_sleephrs/3600 >= 5 ~ 2,
           psqi_q4_sleephrs/3600 < 5 ~ 3,
           .default = NA
         ),
         psqi_4 = (psqi_q3_getuptime-psqi_q1_bedtime)/3600,
         psqi_4 = if_else(psqi_q3_getuptime < psqi_q1_bedtime, psqi_4 + 24, psqi_4),
         psqi_4 = (psqi_q4_sleephrs/3600 / as.numeric(psqi_4))*100,
         psqi_4 = case_when(
           psqi_4 >= 85 ~ 0,
           psqi_4 >= 75 ~ 1,
           psqi_4 >= 65 ~ 2,
           psqi_4 < 65 ~ 3,
           .default = NA
         ),
         psqi_5 = (sum(pick(psqi_q5b_earlyawake:psqi_q5j_other)) - 1) %/% 9 + 1,
         psqi_6 = psqi_q7_sleepmed,
         psqi_7 = ((psqi_q8_staywake + psqi_q9_dailytasks)-1) %/% 2 + 1,
         psqi_8 = sum(pick(psqi_1:psqi_7)),
         across(psqi_1:psqi_7, \(x) factor(x, levels = c(0,1,2,3)))
         )

#set variable labels
var_label(psqi_switzerland_results) <- psqi_labels %>% as.list()

psqi_switzerland_result_table <- 
psqi_switzerland_results %>% 
  tbl_summary(by = Day, include = -code,
              missing_text = "missing",
              statistic = list(psqi_8 ~ "{median} ({min},{max})")) %>% 
  gtsummary::add_p() %>% 
  add_significance_stars()
The following errors were returned during `add_p()`:
✖ For variables `psqi_1` through `psqi_8` (`Day`) and "p.value" statistic: The
  package "cardx" (>= 0.2.2) is required.
psqi_switzerland_result_table %>% 
  as_gt() %>% 
  gtsave("figures/table_psqi_switzerland_results.png")
#combined results
psqi_data <- 
list(
  Malaysia = psqi_malaysia_results %>% ungroup() %>% filter(Day == 31) %>% select(-Id, -Day),
  Switzerland = psqi_switzerland_results %>% ungroup() %>% filter(Day == 31) %>% select(-code, -Day)
) %>% 
  list_rbind(names_to = "site")

var_label(psqi_data) <- psqi_labels %>% as.list()

psqi_data %>% 
  tbl_summary(by = site,
              missing_text = "missing",
              type = list(psqi_8 ~ "continuous"),
              statistic = list(psqi_8 ~ "{median} ({min},{max})")) %>% 
  gtsummary::add_p() %>% 
  add_significance_stars() %>% 
  as_gt() %>% 
  gtsave("figures/table_psqi_Day31_results.png")
The following errors were returned during `as_gt()`:
✖ For variables `psqi_1` through `psqi_8` (`site`) and "p.value" statistic: The
  package "cardx" (>= 0.2.2) is required.
psqi_data <- 
list(
  Malaysia = psqi_malaysia_results %>% ungroup() %>% filter(Day == 1) %>% select(-Id, -Day),
  Switzerland = psqi_switzerland_results %>% ungroup() %>% filter(Day == 1) %>% select(-code, -Day)
) %>% 
  list_rbind(names_to = "site")

var_label(psqi_data) <- psqi_labels %>% as.list()

psqi_data %>% 
  tbl_summary(by = site,
              missing_text = "missing",
              type = list(psqi_8 ~ "continuous"),
              statistic = list(psqi_8 ~ "{median} ({min},{max})")) %>% 
  gtsummary::add_p() %>% 
  add_significance_stars() %>% 
  as_gt() %>% 
  gtsave("figures/table_psqi_Day01_results.png")
The following errors were returned during `as_gt()`:
✖ For variables `psqi_1` through `psqi_8` (`site`) and "p.value" statistic: The
  package "cardx" (>= 0.2.2) is required.

6.2 Tables Descriptives

#data preparation: log10-transform MEDI for the descriptive table
metrics$data[[14]] <- metrics$data[[14]] %>% mutate(value = log10(MEDI))

metric_descriptive <- 
metrics %>%
  rowwise() %>% 
  mutate(data = list(data %>% 
           group_by(site) %>% 
           select(site, value)
         )
  ) %>% 
  filter(metric != "mean") %>% 
  unnest(data) %>% 
  mutate(site = site %>% fct_relabel(str_to_title)) %>% 
  group_by(metric, site) %>% 
  mutate(number = seq_along(value)) %>% 
  pivot_wider(names_from = site,
              values_from = value,
              values_fill = NA) %>% 
  select(-number)
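The `number` helper above is the usual way to `pivot_wider()` groups of unequal size into side-by-side columns: without a within-group row index, values sharing a metric would be collapsed into list-columns. A small sketch of the pattern:

```r
library(dplyr)
library(tidyr)

# two values for site A, one for site B
df <- tibble(site  = c("A", "A", "B"),
             value = c(1, 2, 3))

wide <- df %>%
  group_by(site) %>%
  mutate(number = seq_along(value)) %>%  # within-group row index
  pivot_wider(names_from = site, values_from = value) %>%
  select(-number)

# unequal group sizes are padded with NA
stopifnot(nrow(wide) == 2, is.na(wide$B[2]))
```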

#table generation
table_metrics_descriptive <- 
metric_descriptive %>% 
  summarize(across(everything(), 
                   list(
                    mean = \(x) mean(x, na.rm = TRUE),
                    sd = \(x) sd(x, na.rm = TRUE),
                    n = \(x) x[!is.na(x)] %>% length())
                   )
  ) %>% 
  mutate(
    difference = Malaysia_mean - Switzerland_mean
  ) %>% 
  gt(rowname_col = "metric") %>% 
  fmt_number(ends_with(c("mean", "sd", "difference")), n_sigfig = 3) %>%
  fmt_number(ends_with(c("mean", "sd", "difference")), decimals = 0,
             rows = c("FLiT1000", "FLiT250", "FcT250", "LE", 
                      "PAT1000")) %>% 
  fmt_number(ends_with("_n"), decimals = 0) %>% 
  fmt_scientific(ends_with(c("mean", "sd", "difference")),
                 rows = "SCmelEDI") %>% 
  fmt_duration(ends_with(c("mean", "sd", "difference")), 
               input_units = "seconds", duration_style = "colon-sep",
               output_units = c("hours", "minutes"),
           rows = 
             c("LLiT10", "LLiT250")
           ) %>% 
  fmt_duration(ends_with(c("mean", "sd", "difference")), 
               input_units = "seconds", duration_style = "narrow",
               output_units = c("hours", "minutes"),
           rows = 
             c("TAT1000", "TAT250", "TATd250", "TBTe10")
           ) %>% 
  tab_spanner(columns = contains(c("Malaysia")), label = "Malaysia") %>% 
  tab_spanner(columns = contains(c("Switzerland")), label = "Switzerland") %>% 
  cols_label(
    difference ~ "Difference M-S",
    contains("_mean") ~ "Mean",
    contains("_sd") ~ "SD",
    contains("_n") ~ "n"
  ) %>% 
  tab_style(locations = list(
                cells_body(columns = contains(c("_mean", "difference"))),
                cells_column_labels(),
                cells_column_spanners(),
                cells_stub()
                ),
                 style = cell_text(weight = "bold")) %>% 
    tab_style(
      style = cell_text(color = pal_jco()(1)),
      locations = cells_column_spanners(1)
    ) %>% 
    tab_style(
      style = cell_text(color = pal_jco()(2)[2]),
      locations = cells_column_spanners(2)
    ) %>% 
    cols_add(Plot = NA) %>% 
    cols_label(Plot = "Boxplot") %>% 
    text_transform(locations = cells_body(Plot),
                   fn = \(x) {
                       gt::ggplot_image(
                         {
                           metric_descriptive$metric %>% 
                             unique() %>%
                             map(\(x) {
                               metric_descriptive %>%
                                 pivot_longer(-metric, names_to = "site", values_to = "value") %>%
                                 filter(metric == x) %>%
                                 ggplot(aes(x = site, y = value, fill = site)) +
                                 geom_boxplot() +
                                 theme_void() +
                                 theme(legend.position = "none") +
                                 scale_fill_jco() +
                                 coord_flip()
                             })
                           
                         },
                         height = gt::px(50), aspect_ratio = 2
                       )
                   }) %>%
  tab_footnote("values were log10 transformed", 
               locations = cells_stub(c("L5m", "L5mde", "L5mn", "M10m", "MEDI")))

table_metrics_descriptive
| Metric | Malaysia Mean | Malaysia SD | Malaysia n | Switzerland Mean | Switzerland SD | Switzerland n | Difference M−S |
|---|---|---|---|---|---|---|---|
| FLiT1000 | 38,664 | 8,857 | 413 | 35,909 | 8,818 | 489 | 2,755 |
| FLiT250 | 37,192 | 9,723 | 439 | 33,204 | 9,880 | 509 | 3,988 |
| FcT250 | 36 | 31 | 490 | 67 | 44 | 533 | −31 |
| IS | 0.162 | 0.0581 | 19 | 0.164 | 0.0664 | 20 | −0.00129 |
| IV | 1.29 | 0.483 | 472 | 1.28 | 0.504 | 504 | 0.00480 |
| L5m¹ | −0.781 | 0.524 | 490 | −0.966 | 0.183 | 533 | 0.185 |
| L5mde¹ | 0.836 | 0.785 | 490 | 0.971 | 0.912 | 533 | −0.136 |
| L5mn¹ | −0.341 | 0.897 | 490 | −0.797 | 0.489 | 533 | 0.456 |
| LE | 5,117 | 7,132 | 490 | 18,765 | 26,249 | 533 | −13,648 |
| LLiT10 | 23:18 | 01:04 | 483 | 22:18 | 01:52 | 528 | 01:00 |
| LLiT250 | 17:42 | 02:36 | 432 | 19:14 | 02:10 | 510 | −01:31 |
| M10m¹ | 2.37 | 0.578 | 490 | 2.87 | 0.706 | 533 | −0.503 |
| MEDI¹ | 0.469 | 1.42 | 704,123 | 0.547 | 1.61 | 765,617 | −0.0788 |
| MPratio | 0.932 | 0.119 | 490 | 0.947 | 0.103 | 533 | −0.0150 |
| PAT1000 | 957 | 1,360 | 490 | 1,692 | 2,031 | 533 | −736 |
| SCmelEDI | 1.52 × 10⁻³ | 1.36 × 10⁻⁴ | 490 | 1.40 × 10⁻³ | 1.19 × 10⁻⁴ | 533 | 1.15 × 10⁻⁴ |
| TAT1000 | 47m | 1h 4m | 490 | 1h 44m | 1h 41m | 533 | −57m |
| TAT250 | 1h 38m | 1h 38m | 490 | 3h 28m | 2h 20m | 533 | −1h 50m |
| TATd250 | 1h 36m | 1h 36m | 490 | 3h 25m | 2h 18m | 533 | −1h 49m |
| TBTe10 | 2h 32m | 1h 5m | 490 | 2h 39m | 1h 1m | 533 | −7m |

¹ values were log10 transformed. (The rendered table additionally contains a per-metric boxplot column, omitted here.)
gtsave(table_metrics_descriptive, "figures/Table_Metrics_Descriptive.png")

6.3 Tables Hypothesis results

H1_table %>% 
    cols_add(Plot2 = 1:nrow(inference_H1)) %>% 
    cols_hide(Plot) %>% 
    cols_label(Plot2 = "") %>% 
    text_transform(locations = cells_body(Plot2),
                   fn = \(x) {
                       gt::ggplot_image(
                         boxplot_function(as.numeric(x),inference_H1, value_dc),
                         height = gt::px(50),
                         aspect_ratio = 2
                         )
                   }) %>% 
  gtsave("figures/Table_H1.png")
H2_table_timing %>% 
    cols_add(Plot2 = 1:nrow(inference_H2)) %>% 
    cols_hide(Plot) %>% 
    cols_label(Plot2 = "") %>% 
    text_transform(locations = cells_body(Plot2),
                   fn = \(x) {
                       gt::ggplot_image(
                         boxplot_function(as.numeric(x),inference_H2, value),
                         height = gt::px(50),
                         aspect_ratio = 2
                         )
                   }) %>% 
  gtsave("figures/Table_H2_timing.png")
Warning: Removed 47 rows containing non-finite outside the scale range
(`stat_boxplot()`).
Warning: Removed 12 rows containing non-finite outside the scale range
(`stat_boxplot()`).
Warning: Removed 81 rows containing non-finite outside the scale range
(`stat_boxplot()`).
H2_table_interaction %>% 
  cols_width(
    metric ~ px(440),
    Day ~ px(120),
    p.value ~ px(120)
  ) %>% 
  gtsave("figures/Table_H2_interaction.png")
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_boxplot()`).

6.4 Figure 1 Overview

# data preparation
data_combined <-
  join_datasets(add.origin = TRUE,
                Malaysia = data$malaysia %>%
  mutate(across(c(Datetime, dusk, dawn), \(x) force_tz(x, "UTC"))),
  Switzerland = data$switzerland %>%
  mutate(across(c(Datetime, dusk, dawn), \(x) force_tz(x, "UTC")))
  ) %>%
  rename(site = dataset) %>%
  group_by(site, Id) %>%
  mutate(Id = fct_relabel(Id, \(x) str_remove(x, "ID|MY0")),
         Id = factor(Id, levels = sprintf("%02d",1:20)))

data_combined_original <-
  join_datasets(add.origin = TRUE,
                Malaysia = data_malaysia %>%
  mutate(across(c(Datetime), \(x) force_tz(x, "UTC"))),
  Switzerland = data_switzerland %>%
  mutate(across(c(Datetime), \(x) force_tz(x, "UTC")))
  ) %>%
  rename(site = dataset) %>%
  group_by(site, Id) %>%
  mutate(Id = fct_relabel(Id, \(x) str_remove(x, "ID|MY0")),
         Id = factor(Id, levels = sprintf("%02d",1:20)))
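Putting both sites onto one shared axis relies on `lubridate::force_tz()`, which keeps the wall-clock reading and only relabels the time zone; this is different from `with_tz()`, which keeps the instant and shifts the clock. A minimal illustration:

```r
library(lubridate)

# 08:00 local time in Kuala Lumpur (UTC+8)
t_kl <- ymd_hms("2024-03-01 08:00:00", tz = "Asia/Kuala_Lumpur")

# force_tz(): same clock reading, different instant -> still 08:00
stopifnot(hour(force_tz(t_kl, "UTC")) == 8)

# with_tz(): same instant, shifted clock reading -> 00:00 UTC
stopifnot(hour(with_tz(t_kl, "UTC")) == 0)
```

Relabelling rather than converting is what keeps the daily light patterns of both sites aligned on local time rather than absolute time.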

Photoperiods_combined <- 
data_combined %>% 
  group_by(site, Id, Day) %>% 
  filter(!marked_for_removal) %>%
  select(site, Id, Day, photoperiod_duration) %>%
  summarize(
    photoperiod_duration = first(photoperiod_duration)
  )
`summarise()` has grouped output by 'site', 'Id'. You can override using the
`.groups` argument.
data_1day <-
  data_combined %>%
  mutate(site = factor(site),
         dawn = hms::as_hms(dawn),
         dusk = hms::as_hms(dusk)) %>% 
  group_by(site) %>% 
  filter(!marked_for_removal) %>%
  select(site, Id, Datetime, MEDI, dusk, dawn) %>%
  aggregate_Date(
    unit = "15 mins",
    # unit = "5 mins",
    mean_MEDI = mean(log10(MEDI), na.rm = TRUE),
    numeric.handler = \(x) median(x, na.rm = TRUE),
    lower100 = min(MEDI, na.rm = TRUE),
    lower95 = quantile(MEDI, 0.025, na.rm = TRUE),
    lower75 = quantile(MEDI, 0.125, na.rm = TRUE),
    lower50 = quantile(MEDI, 0.25, na.rm = TRUE),
    upper50 = quantile(MEDI, 0.75, na.rm = TRUE),
    upper75 = quantile(MEDI, 0.875, na.rm = TRUE),
    upper95 = quantile(MEDI, 0.975, na.rm = TRUE),
    upper100 = max(MEDI, na.rm = TRUE),
    dawn = mean(dawn),
    dusk = mean(dusk),
    date.handler = \(x) as_date("1970-01-01")
  )

data_1day$dusk <- 
  data_1day$dusk %>% 
  hms::as_hms() %>% 
  {paste("1970-01-01", .)} %>% 
  parse_date_time(orders ="ymdHMS")
data_1day$dawn <- 
  data_1day$dawn %>% 
  hms::as_hms() %>% 
  {paste("1970-01-01", .)} %>% 
  parse_date_time(orders ="ymdHMS")

photoperiod_1day <-
  data_1day %>%
  summarize(dawn = first((dawn)),
            dusk = first((dusk)),
            .groups = "drop")

hlinelabel <-
  expand_grid(
    site = c("Malaysia", "Switzerland"),
    values = c(1, 10, 250)
  )
#preparing P5 comparison
singular_plot <-  
  data_1day %>% 
  ungroup() %>%
  gg_doubleplot(
    geom = "line", jco_color = FALSE, col = "black",
    alpha = 1) +
  geom_rect(data = photoperiod_1day,
   aes(xmin = as_datetime("1970-01-01", tz = "UTC"),
       xmax = dawn, ymin = -Inf, ymax = Inf), alpha = 0.25)+
  geom_rect(data = photoperiod_1day,
   aes(xmin = dusk, xmax = (dawn + ddays(1)), ymin = -Inf, ymax = Inf), alpha = 0.25)+
  geom_rect(data = photoperiod_1day,
   aes(xmin = dusk + ddays(1),
       xmax = as_datetime("1970-01-03", tz = "UTC"),
       ymin = -Inf, ymax = Inf), alpha = 0.25)+
  facet_wrap(~ site, ncol = 1) +
  theme(plot.subtitle = element_textbox_simple(padding = margin(b=10)))
Warning in eval(geom_function_expr)(ggplot2::aes(group = NULL, fill = NULL, :
Ignoring unknown parameters: `fill`
singular_plot <- 
singular_plot +
  # geom_ribbon(aes(ymin = lower100, ymax = upper100, fill = site), alpha = 0.4) +
  geom_ribbon(aes(ymin = lower95, ymax = upper95, fill = site), alpha = 0.4) +
  geom_ribbon(aes(ymin = lower75, ymax = upper75, fill = site), alpha = 0.4) +
  geom_ribbon(aes(ymin = lower50, ymax = upper50, fill = site), alpha = 0.4) +
  theme(panel.spacing.x = unit(2.5, "lines")) +
  scale_color_jco()+
  scale_fill_jco()

# draw the line layer (added first) on top of the rectangle and ribbon layers
singular_plot$layers <-
  c(singular_plot$layers[2:7], singular_plot$layers[1])

singular_plot <-
singular_plot +
  map(c(1, 10, 250), 
      \(x) geom_hline(aes(yintercept = !!x), linetype = 2, alpha = 0.25)
      ) +
  geom_label(data = hlinelabel, aes(y = values, label = values,
                                    x = as_datetime("1970-01-01 04:00:00", 
                                                    tz = "UTC")),
             size = 2.5)
#P1
#Image of Protocol
path <- "images/2024.12.05.ASEAN-Study-Design_v1.6.png"
P1 <- 
ggdraw() +
  draw_image(path)

#P2
#Timeframe of data collection
P2 <- 
data_combined_original %>% 
  filter(!is.na(MEDI)) %>% 
  mutate(Id = fct_rev(Id)) %>% 
  gg_overview() +
  # scale_y_discrete(labels = NULL) +
  scale_x_datetime(date_breaks = "1 month",
                   date_labels = "%b %y") +
  aes(col = site) +
  labs(y = "Participant ID") +
  scale_color_jco() +
  theme(legend.position = "none",
        axis.line.y = element_blank(),
        axis.ticks.y = element_blank())
Scale for x is already present.
Adding another scale for x, which will replace the existing scale.
#P3
#Locations
source("scripts/Worldmap.R", local = TRUE)

Attaching package: 'rnaturalearth'

The following object is masked from 'package:rnaturalearthdata':

    countries110

Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
Warning in layer_sf(data = data, mapping = mapping, stat = stat, geom = GeomLabel, : Ignoring unknown parameters: `box.padding`, `point.padding`, and
`segment.colour`
Warning in layer_sf(data = data, mapping = mapping, stat = stat, geom =
GeomLabel, : Ignoring unknown aesthetics: lat and lon
P3 <- 
P3 + labs(x=NULL, y = NULL)

#P4
#Photoperiods

P4 <- 
Photoperiods_combined %>% 
  mutate(site = factor(site)) %>% 
  ggplot() +
  aes(x = photoperiod_duration,
      y = fct_rev(site),
      fill = site,
      col = site) +
  geom_boxplot(alpha = 0.75)+
  scale_fill_jco() +
  scale_color_jco()+
  theme_minimal() +
  theme(legend.position = "none",
        axis.text.y = element_blank()) +
  labs(y = "", x = "Photoperiod duration (h)")

#P5
#average daily pattern
P5 <- 
singular_plot + 
  theme_minimal()+
  theme(legend.position = "none",
        strip.text = element_blank()) +
  labs(y = "Melanopic illuminance (lx, mel EDI)", x = "Local time (HH:MM)")


#composite

P1/ (P3 + P2) / (P4 + P5 + plot_layout(widths = c(1, 2))) +
  plot_annotation(tag_levels = 'A') +
  plot_layout(guides = "collect", heights = c(3,2,2)) &
  theme(axis.title = element_text(size = 10),
        axis.text = element_text(size = 10),
        plot.tag = element_text(size = 20, face = "plain")) 
Warning in st_point_on_surface.sfc(sf::st_zm(x)): st_point_on_surface may not
give correct results for longitude/latitude data

ggsave("figures/Figure_1.pdf", width = 17, height = 15, scale = 2, units = "cm")
# P5 + theme(plot.margin = margin(10,20,10,10),
#            axis.title = element_text(size = 12),
#         axis.text = element_text(size = 12))
# ggsave("figures/Average_Day.pdf", width = 8, height = 6, scale = 2, units = "cm")

6.5 Figure 2 Average day

data_1day <-
  data_combined %>%
  mutate(site = factor(site),
         dawn = hms::as_hms(dawn),
         dusk = hms::as_hms(dusk)) %>% 
  filter(!marked_for_removal) %>%
  select(site, Id, Datetime, MEDI, dusk, dawn) %>%
  aggregate_Date(
    unit = "15 mins",
    mean_MEDI = mean(log10(MEDI), na.rm = TRUE),
    sd_MEDI = sd(log10(MEDI), na.rm = TRUE),
    cv_MEDI = sd_MEDI/mean_MEDI,
    numeric.handler = \(x) median(x, na.rm = TRUE),
    lower95 = quantile(MEDI, 0.025, na.rm = TRUE),
    lower75 = quantile(MEDI, 0.125, na.rm = TRUE),
    lower50 = quantile(MEDI, 0.25, na.rm = TRUE),
    upper50 = quantile(MEDI, 0.75, na.rm = TRUE),
    upper75 = quantile(MEDI, 0.875, na.rm = TRUE),
    upper95 = quantile(MEDI, 0.975, na.rm = TRUE),
    dawn = mean(dawn, na.rm = TRUE),
    dusk = mean(dusk, na.rm = TRUE),
    date.handler = \(x) as_date("1970-01-01")
  )
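
The 15-minute quantile aggregation above can be sketched in plain base R (toy data; the real pipeline uses LightLogR's `aggregate_Date()`, which additionally handles grouping by site and participant and collapses dates):

```r
# Toy sketch: bin one day of 1-min melanopic EDI values into 15-min bins
# and compute the median plus the 2.5 %/97.5 % quantiles per bin.
set.seed(1)
minutes <- 0:1439                            # minutes since midnight
medi    <- 10^runif(length(minutes), 0, 4)   # fake mel EDI values (lx)
bin     <- minutes %/% 15                    # 15-minute bin index (0..95)

agg <- data.frame(
  bin     = sort(unique(bin)),
  median  = tapply(medi, bin, median),
  lower95 = tapply(medi, bin, quantile, probs = 0.025),
  upper95 = tapply(medi, bin, quantile, probs = 0.975)
)
nrow(agg)  # 96 bins of 15 minutes
```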

data_1day <- 
  data_1day %>% 
  mutate(across(c(dusk, dawn),
                \(x) parse_date_time(paste("1970-01-01", hms::as_hms(x)),
                                     orders = "ymdHMS")))
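
The dusk/dawn conversion anchors a time-of-day to the dummy date 1970-01-01 so it can be drawn on the double plot's datetime axis. In base R (without lubridate) the same anchoring can be sketched as:

```r
# Anchor an "HH:MM:SS" time-of-day string to the dummy date 1970-01-01,
# mirroring the paste() + parse_date_time() step above.
anchor_time <- function(x) {
  as.POSIXct(paste("1970-01-01", x), tz = "UTC",
             format = "%Y-%m-%d %H:%M:%S")
}
anchor_time("06:42:10")
```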

IS_data <- 
  metrics$data[metrics$metric == "IS"][[1]] %>% 
  mutate(Id = fct_relabel(Id, \(x) str_remove(x, "ID|MY0")),
         Id = factor(Id, levels = sprintf("%02d",1:20))) %>% 
  mutate(site = site %>% fct_relabel(str_to_title))
photoperiod_1day <-
  data_1day %>%
  summarize(dawn = first(dawn),
            dusk = first(dusk),
            .groups = "drop")

hlinelabel <-
  expand_grid(
    site = c("Malaysia", "Switzerland"),
    Id =  sprintf("%02d", 1:20),
    values = c(1, 10, 250)
  ) %>%
  filter(!(site == "Malaysia" & Id == "06"))

facet_names <- c(Malaysia = "Kuala Lumpur, Malaysia", 
                 Switzerland = "Basel, Switzerland")

singular_plot <-  
  data_1day %>% 
  ungroup() %>% 
  gg_doubleplot(
    geom = "line", jco_color = FALSE, facetting = FALSE, col = "black",
    alpha = 1,
    y.axis.breaks = c(0, 1, 10, 1000, 100000)) +
  geom_rect(data = photoperiod_1day,
   aes(xmin = as_datetime("1970-01-01", tz = "UTC"),
       xmax = dawn, ymin = -Inf, ymax = Inf), alpha = 0.25)+
  geom_rect(data = photoperiod_1day,
   aes(xmin = dusk, xmax = (dawn + ddays(1)), ymin = -Inf, ymax = Inf), alpha = 0.25)+
  geom_rect(data = photoperiod_1day,
   aes(xmin = dusk + ddays(1),
       xmax = as_datetime("1970-01-03", tz = "UTC"),
       ymin = -Inf, ymax = Inf), alpha = 0.25)+
  facet_grid2(Id ~ site, switch = "y",
             labeller = labeller(site = facet_names),
             strip = 
               strip_themed(background_x = elem_list_rect(fill = pal_jco()(2)))) +
  labs(fill = "Percentiles", 
       title = "Double plots of average days per participant ID and site",
       subtitle = 
         "Median and percentiles of light exposure per 15 minutes (<span style='color:#ee0000;'>95%</span>, <span style='color:#ee0000CC;'>75%</span>, and <span style='color:#ee0000AA;'>50%</span> of data shown in ribbons). Grey backgrounds indicate (civil) nighttime. "
       )+
  theme(plot.subtitle = element_textbox_simple(padding = margin(b=10)))
Warning in eval(geom_function_expr)(ggplot2::aes(group = NULL, fill = NULL, :
Ignoring unknown parameters: `fill`
singular_plot <- 
singular_plot +
  geom_ribbon(aes(ymin = lower95, ymax = upper95), fill = "red2", alpha = 0.25) +
  geom_ribbon(aes(ymin = lower75, ymax = upper75), fill = "red2", alpha = 0.25) +
  geom_ribbon(aes(ymin = lower50, ymax = upper50), fill = "red2", alpha = 0.25) +
    geom_label(data = IS_data,
            aes(x = as_datetime("1970-01-03", tz = "UTC") - dminutes(30), 
                y = 10^5, label = paste("IS =", value %>% round(2), "")),
            hjust = 1,
            vjust = 1,
            alpha = 0.75
            ) +
  theme(panel.spacing.x = unit(2.5, "lines")) +
  ggplot2::scale_y_continuous(trans = "symlog", 
    breaks = c(0, 10^(0:5)), labels = 
      c("0", "", "10", "", "1 000", "", "100 000")
  )
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
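
The "symlog" transform keeps zero-lux values on an otherwise logarithmic axis. The exact transform object used here is defined elsewhere in the script; a common definition, shown only as a sketch, is linear near zero and logarithmic for large values:

```r
# One common symmetric-log definition (assumption: the transform actually
# used by the plot comes from elsewhere and may use additional thresholds).
symlog     <- function(x) sign(x) * log10(1 + abs(x))
symlog_inv <- function(y) sign(y) * (10^abs(y) - 1)
symlog(c(0, 9, 99, 999))  # 0, 1, 2, 3 -- zero stays finite
```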
singular_plot$layers <-
  c(singular_plot$layers[2:8], singular_plot$layers[1])

singular_plot <-
singular_plot +
  map(c(1, 10, 250), 
      \(x) geom_hline(aes(yintercept = !!x), linetype = 2, alpha = 0.25)
      ) +
  geom_label(data = hlinelabel, aes(y = values, label = values,
                                    x = as_datetime("1970-01-01 04:00:00", 
                                                    tz = "UTC")),
             size = 2.5) +
  labs(x = "Local time (HH:MM)", y = "Melanopic illuminance (lx, mel EDI)")
  

singular_plot

ggsave("figures/Figure_2.pdf", singular_plot, 
       width = 17, height = 25, units = "cm", scale = 2)

6.6 Figure 3 Daily patterns

light_pattern <-
plot_smooth(
  Pattern_model,
  view = "Time",
  plot_all = "site",
  rug = FALSE,
  n.grid = 90,
  col = pal_jco()(2),
  rm.ranef = "s(Id)",
  sim.ci = TRUE,
  )
Summary:
    * site : factor; set to the value(s): malaysia, switzerland. 
    * Time : numeric predictor; with 200 values ranging from 0.000000 to 23.983333. 
    * Id : factor; set to the value(s): MY012. 
    * NOTE : The following random effects columns are canceled: s(Time):sitemalaysia,s(Time):siteswitzerland,s(Id):sitemalaysia,s(Id):siteswitzerland
 
    * Simultaneous 95%-CI used : 
        Critical value: 2.344
        Proportion posterior simulations in pointwise CI: 0.87 (10000 samples)
        Proportion posterior simulations in simultaneous CI: 0.95 (10000 samples)
 

light_pattern_diff <- 
plot_diff(Pattern_model,
          view = "Time", 
          rm.ranef = "s(Id)",
          comp = list(site = c("malaysia", "switzerland")),
          sim.ci = TRUE,
          n.grid = 10000)
Summary:
    * Time : numeric predictor; with 10000 values ranging from 0.000000 to 23.983333. 
    * Id : factor; set to the value(s): MY012. 
    * NOTE : The following random effects columns are canceled: s(Time):sitemalaysia,s(Time):siteswitzerland,s(Id):sitemalaysia,s(Id):siteswitzerland
 
    * Simultaneous 95%-CI used : 
        Critical value: 2.084
        Proportion posterior simulations in pointwise CI: 1 (10000 samples)
        Proportion posterior simulations in simultaneous CI: 1 (10000 samples)
 


Time window(s) of significant difference(s):
    0.000000 - 4.108756
    7.330040 - 19.819410
    20.946740 - 23.983333
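
The significance windows reported by `plot_diff()` above are simply the runs of time points at which the simultaneous CI excludes zero. With toy data, the run-length logic behind such windows can be sketched as:

```r
# Toy sketch: flag time points where the CI excludes zero, then collect
# consecutive flagged points into windows (fake difference smooth below).
time <- seq(0, 24, length.out = 100)
est  <- sin(time / 24 * 2 * pi)   # fake fitted difference
ci   <- rep(0.5, length(time))    # fake simultaneous CI half-width

excludes_zero <- (est - ci > 0) | (est + ci < 0)
runs   <- rle(excludes_zero)
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1
windows <- data.frame(start = time[starts], end = time[ends])[runs$values, ]
windows  # two windows, where |sin| > 0.5
```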
light_pattern_individuals <- 
plot_smooth(
  Pattern_modelm1,
  view = "Time",
  plot_all = "Id",
  rm.ranef = FALSE,
  se = 0,
  rug = FALSE,
  n.grid = 90,
  col = c(rep(colors_pattern[1], 19),rep(colors_pattern[2], 20)),
  sim.ci = TRUE
  )
Summary:
    * Time : numeric predictor; with 200 values ranging from 0.000000 to 23.983333. 
    * Id : factor with 39 values; set to the value(s): ID01, ID02, ID03, ID04, ID05, ID06, ID07, ID08, ID09, ID10, ... 
    * site : factor; set to the value(s): switzerland. 
    * Simultaneous 95%-CI used : 
        Critical value: 3.953
        Proportion posterior simulations in pointwise CI: 1 (10000 samples)
        Proportion posterior simulations in simultaneous CI: 0.95 (10000 samples)
 

time_labels <- c("00:00", "06:00", "12:00", "18:00", "24:00")

P1 <- 
light_pattern$fv %>% 
  mutate(site = str_to_title(site)) %>% 
  ggplot(aes(x = Time*60*60, y = fit, col = site)) +
  geom_ribbon(aes(ymin = ll, ymax = ul, fill = site), alpha = 0.25, col = NA)+
  geom_line() +
  theme_minimal() +
  scale_color_jco()+
  scale_fill_jco()+
  labs(y = "Model fit with 95% CI, log10(mel EDI, lx)", x = "Local Time (HH:MM)", 
       color = "Site", fill = "Site") +
  scale_x_time(limits = c(0, 24*60*60), expand = c(0,0),
               breaks = c(0, 6, 12, 18, 24)*60*60, labels = time_labels
               )

label_sig <- "Credible non-\nzero difference"

P2 <- 
light_pattern_diff %>% 
  mutate(credible = (est - sim.CI >= 0) | (est + sim.CI <= 0)) %>% 
  ggplot(aes(x = Time*60*60, y = est, group = consecutive_id(credible))) +
  geom_ribbon(aes(ymin = est - sim.CI, ymax = est + sim.CI, fill = credible), 
              alpha = 0.25, col = NA) +
  geom_line(aes(col = credible)) +
  geom_hline(aes(yintercept = 0)) +
  theme_minimal() +
  labs(y = "Fitted difference smooth with 95% CI", x = NULL,
       col = label_sig, fill = label_sig) +
  scale_x_time(limits = c(0, 24*60*60), expand = c(0,0),
               breaks = c(0, 6, 12, 18, 24)*60*60, labels = time_labels)

P3 <- 
light_pattern_individuals$fv %>% 
  mutate(site = str_detect(Id, "MY"),
         site = if_else(site, "Malaysia", "Switzerland")) %>% 
  ggplot(aes(x = Time*60*60, y = fit, col = site, group = Id)) +
  # geom_ribbon(aes(ymin = ll, ymax = ul, fill = site), alpha = 0.25, col = NA)+
  geom_line() +
  theme_minimal() +
  scale_color_jco()+
  scale_fill_jco()+
  labs(y = "Model fit, log10(mel EDI, lx)", x = NULL, col = "Site", fill = "Site") +
  scale_x_time(limits = c(0, 24*60*60), expand = c(0,0),
               breaks = c(0, 6, 12, 18, 24)*60*60, labels = time_labels) +
  theme(legend.position = "none")

(P3 + P1 + P2) +
  plot_annotation(tag_levels = "A") +
  plot_layout(guides = "collect")

ggsave("figures/Figure_3.pdf", 
       width = 17, height = 7, units = "cm", scale = 1.6)

6.7 Figure 4 Correlation Matrix

#visualization
(corr_plot_malaysia + theme(legend.position = "none")) + 
  (corr_plot_switzerland) + 
  # plot_layout(guides = "collect")+
  theme(legend.position = "bottom",
        legend.justification = c(1,1))

ggsave("figures/Figure_4.pdf", 
       width = 25, height = 17, units = "cm", scale = 1.6)
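
The `corr_plot_*` objects are assembled earlier in the script; underneath sits an ordinary per-site correlation matrix. A toy sketch with hypothetical metric columns:

```r
# Toy correlation matrix (column names are hypothetical stand-ins for the
# study's light metrics; the real plots are drawn with ggcorrplot).
set.seed(7)
d <- data.frame(metric_a = rnorm(20),
                metric_b = rnorm(20),
                metric_c = rnorm(20))
cm <- cor(d, method = "spearman")
round(cm, 2)
```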

6.8 S1 Missing time: Bootstrap results

bootstrap_0 <- 
bootstrap_comparison %>% 
  filter(threshold == 1) %>% 
  # At threshold 0 no data are removed, so every bootstrap column is pinned
  # to the observed value: `mean` here is the data column, not the function.
  mutate(across(contains("bs"), \(x) mean),
         threshold = 0)
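
The comparison rests on percentile bootstrap intervals; stripped of the missing-hours thresholding, the core resampling step looks like this (fake data, base R only):

```r
# Minimal percentile bootstrap of a mean: resample days with replacement,
# recompute the metric, and take the 2.5 %/97.5 % quantiles.
set.seed(42)
daily_values <- rnorm(30, mean = 100, sd = 15)  # fake daily metric values
bs <- replicate(1000, mean(sample(daily_values, replace = TRUE)))
c(lower = unname(quantile(bs, 0.025)),
  est   = mean(bs),
  upper = unname(quantile(bs, 0.975)))
```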

bootstrap_0 %>% 
  rbind(bootstrap_comparison) %>%
  ggplot(aes(x = threshold)) +
    geom_ribbon(aes(ymin = (mean - 2*se), ymax = (mean + 2*se)), 
                fill = "steelblue", alpha = 0.4) +
  geom_hline(data = subset_metrics, aes(yintercept=mean), color = "steelblue") +
  geom_errorbar(
    aes(ymin = lower_bs1, ymax = upper_bs1), linewidth = 0.5, width = 0) +
  geom_errorbar(aes(ymin = lower_bs2, ymax = upper_bs2), 
                linewidth = 0.25, width = 0) +
  geom_point(aes(y=mean_bs, 
                 col = ((mean - 2*se) <= lower_bs3 & upper_bs3 <= (mean + 2*se)))) +
  facet_wrap(Id~metric, scales = "free") +
  theme_minimal() +
  labs(x = "Hours missing from the day", y = "Metric value", col = "Within Range") +
  theme(legend.position = "bottom") +
  coord_cartesian(xlim = c(0, 21), expand = FALSE, clip = FALSE) +
  geom_text(
    data = 
      bootstrap_comparison %>% 
      filter((mean - 2*se) <= lower_bs3 & upper_bs3 <= (mean + 2*se)) %>% 
      group_by(metric, Id) %>% 
      filter(threshold == max(threshold)),
    aes(label = threshold, x = threshold, y = mean_bs),
    size = 2.5,
  )

ggsave("figures/Figure_S1.pdf", 
       width = 10, height = 7, units = "cm", scale = 2)

7 Session Info

sessionInfo()
R version 4.4.2 (2024-10-31)
Platform: aarch64-apple-darwin20
Running under: macOS Sequoia 15.1.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] sf_1.0-19               rnaturalearth_1.0.1     rnaturalearthdata_1.0.0
 [4] ggh4x_0.3.0             magick_2.8.5            ggtext_0.1.2           
 [7] ordinal_2023.12-4.1     labelled_2.14.0         itsadug_2.4.1          
[10] plotfunctions_1.4       mgcv_1.9-1              nlme_3.1-166           
[13] magrittr_2.0.3          lattice_0.22-6          broom.mixed_0.2.9.6    
[16] lmerTest_3.1-3          lme4_1.1-35.5           Matrix_1.7-1           
[19] suntools_1.0.1          rlang_1.1.4             LightLogR_0.3.8        
[22] ggcorrplot_0.1.4.1      ggsci_3.2.0             readxl_1.4.3           
[25] patchwork_1.3.0         gtsummary_2.0.4         gt_0.11.1              
[28] cowplot_1.1.3           lubridate_1.9.4         forcats_1.0.0          
[31] stringr_1.5.1           dplyr_1.1.4             purrr_1.0.2            
[34] readr_2.1.5             tidyr_1.3.1             tibble_3.2.1           
[37] ggplot2_3.5.1           tidyverse_2.0.0         quarto_1.4.4           
[40] pacman_0.5.1           

loaded via a namespace (and not attached):
 [1] rstudioapi_0.17.1   jsonlite_1.8.9      farver_2.1.2       
 [4] nloptr_2.1.1        rmarkdown_2.29      fs_1.6.5           
 [7] ragg_1.3.3          vctrs_0.6.5         minqa_1.2.8        
[10] terra_1.7-78        base64enc_0.1-3     htmltools_0.5.8.1  
[13] haven_2.5.4         broom_1.0.7         cellranger_1.1.0   
[16] sass_0.4.9          parallelly_1.41.0   KernSmooth_2.23-24 
[19] htmlwidgets_1.6.4   commonmark_1.9.2    lifecycle_1.0.4    
[22] pkgconfig_2.0.3     webshot2_0.1.1      R6_2.5.1           
[25] fastmap_1.2.0       future_1.34.0       digest_0.6.37      
[28] numDeriv_2016.8-1.1 colorspace_2.1-1    furrr_0.3.1        
[31] ps_1.8.1            warp_0.2.1          textshaping_0.4.1  
[34] labeling_0.4.3      fansi_1.0.6         timechange_0.3.0   
[37] httr_1.4.7          compiler_4.4.2      proxy_0.4-27       
[40] bit64_4.5.2         withr_3.0.2         backports_1.5.0    
[43] DBI_1.2.3           MASS_7.3-61         ucminf_1.2.2       
[46] classInt_0.4-10     tools_4.4.2         chromote_0.4.0     
[49] units_0.8-5         glue_1.8.0          promises_1.3.2     
[52] gridtext_0.1.5      grid_4.4.2          generics_0.1.3     
[55] gtable_0.3.6        tzdb_0.4.0          class_7.3-22       
[58] websocket_1.4.2     hms_1.1.3           xml2_1.3.6         
[61] utf8_1.2.4          pillar_1.9.0        markdown_1.13      
[64] vroom_1.6.5         later_1.4.1         splines_4.4.2      
[67] renv_1.0.11         bit_4.5.0           tidyselect_1.2.1   
[70] knitr_1.49          xfun_0.49           stringi_1.8.4      
[73] yaml_2.3.10         boot_1.3-31         evaluate_1.0.1     
[76] codetools_0.2-20    cli_3.6.3           systemfonts_1.2.0  
[79] munsell_0.5.1       processx_3.8.5      Rcpp_1.0.14        
[82] globals_0.16.3      parallel_4.4.2      slider_0.3.2       
[85] listenv_0.9.1       viridisLite_0.4.2   scales_1.3.0       
[88] e1071_1.7-16        crayon_1.5.3        cards_0.4.0        

Footnotes

  1. The spectral contribution will be calculated from melanopic irradiance, i.e., melanopic EDI divided by 753.35 (the maximum melanopic luminous efficacy of radiation) and divided by the sum of irradiance across all wavelengths.↩︎